AI chatbots are becoming a default part of customer support, but knowing what to expect from one in practice — which questions it handles well, how customers react to AI-generated answers, and where a human agent is still essential — is harder to figure out before you deploy.
We ran a three-week experiment on our own support chatbot to answer exactly that. It covered 2,498 real tickets, with direct customer feedback and manual agent review of the tickets where customers didn’t weigh in. Below is the full breakdown: the results, the real ticket examples, and the honest conclusions for any team weighing the same decision.
- First response time dropped from 1–2 days to 2–5 minutes
- 89% of AI responses contained useful information
- 17% of tickets were fully resolved by the AI (without agent follow-up)
- Only 5% of responses were incorrect or unhelpful
- The AI excels at informational queries (pricing, GDPR, troubleshooting)
What the Resolution Data Shows
The first question anyone evaluating AI chatbot software for customer support asks: Does it actually work? Here’s what 2,498 tickets told us.
The headline numbers
We deployed an AI-powered chatbot on every incoming support ticket at Elfsight and measured quality through two layers: direct customer feedback via CTA buttons and manual agent review of tickets where customers didn’t respond.
89% of AI responses contained useful information. Roughly nine times out of ten, the chatbot gave the customer something they could work with: a direct answer, a troubleshooting path, or a relevant explanation.
17% of tickets were fully resolved by the AI alone. For context, Gartner’s research found that only 14% of customer service issues are fully resolved through self-service across industries — so 17% is above the industry baseline. Vendor-reported benchmarks run higher (Intercom averages 41%), but those typically reflect enterprise deployments with mature knowledge bases.
Only 5% of responses were incorrect or unhelpful. In those cases, the AI misread the question, gave bad advice, or missed the point entirely.
The response time shift
Even “trendsetter” support teams average 30-minute resolution times, while slower teams can take over 36 hours. – Freshworks 2025 benchmark report.
Before the chatbot, customers received a generic auto-reply (“We’ve received your request and will get back to you shortly”) and waited 1–2 business days for a response – not unusual for a SaaS team with a broad product catalog, but not great for customers with urgent issues either.
After deployment, the first useful response time dropped to 2–5 minutes. Not first acknowledgment — first tailored, relevant answer addressing the customer’s specific question. That shift turned out to matter more than resolution rate.
How an AI Support Chatbot Actually Works
Before diving into the details, here’s how a typical AI chatbot for customer service operates — using our deployment as a practical example.
What the customer experiences
A customer submits a support ticket. Instead of a generic auto-reply, they get a tailored answer pulled from the company’s knowledge base within minutes. Not a canned template or a link dump, but a response that addresses exactly what they asked about.
In our case, the chatbot advises but doesn’t act — that’s something AI agents do. It can explain a pricing plan, troubleshoot a technical issue, or clarify data privacy options. When the problem requires action, the AI explains the situation, tells the customer what information the support team needs, and hands off the conversation.
What it can’t do: process refunds, change subscriptions, access accounts, or debug site-specific display issues (it can’t poke around your website’s code).
The knowledge base: what makes or breaks it
The quality of an AI chatbot’s responses depends almost entirely on the documentation feeding it. This is true regardless of the underlying model — a powerful LLM with a poor knowledge base will produce confident-sounding but unhelpful answers.
In our deployment, the knowledge base has two layers:
The first covers high-volume topics: pricing, subscriptions, refunds, GDPR, and navigation — questions that make up much of the queue and have standardized answers.
The second goes deeper with app-specific documentation: editor guides with annotated screenshots and troubleshooting docs covering known issues, technical limitations, and common edge cases.
The quality difference is stark. Early on, some base files were essentially rephrased product pages — feature lists and marketing copy. The AI’s responses based on those files were bland and generic. The files that actually worked were written by people who’d read hundreds of support tickets and knew the real questions customers ask.
Transparency and human oversight
Two design choices that matter for any customer service chatbot deployment: customers should always know they’re talking to AI, and human agents should always have the final say.
In our setup, every AI response starts with a clear disclosure and ends with three feedback buttons: “All good — close the ticket,” “Helpful, but I still need help,” and “Not helpful — talk to support.” Agents still review every ticket — the AI runs ahead of the queue, not instead of it. Whether the issue is already resolved, partially addressed, or completely off-base, a human always has oversight.
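As a rough illustration of that flow, here is a minimal Python sketch of a reply wrapped in a disclosure and three feedback options, plus a mapping from each click to the next step. All names (`AiReply`, `Feedback`, `next_step`, the disclosure wording, and the step labels) are hypothetical and not taken from our actual implementation. The one property it tries to preserve is that every path ends with a human.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Feedback(Enum):
    ALL_GOOD = "all_good"          # customer says the issue is solved
    NEED_MORE_HELP = "need_more"   # helpful, but an agent should follow up
    NOT_HELPFUL = "not_helpful"    # customer wants a human instead


# Hypothetical disclosure wording; the article only says one is shown.
DISCLOSURE = "This reply was generated by our AI assistant."


@dataclass
class AiReply:
    ticket_id: int
    answer: str

    def render(self) -> str:
        # Disclosure first, answer next, feedback options last.
        buttons = " | ".join(f"[{f.value}]" for f in Feedback)
        return f"{DISCLOSURE}\n\n{self.answer}\n\n{buttons}"


def next_step(feedback: Optional[Feedback]) -> str:
    """Map a button click (or silence) to what happens in the agent queue."""
    if feedback is Feedback.ALL_GOOD:
        return "agent_confirms_and_closes"
    if feedback is Feedback.NOT_HELPFUL:
        return "agent_takes_over"
    # "Need more help" and no click at all both lead to a normal agent review.
    return "agent_reviews_and_follows_up"
```

Treating silence the same as "need more help" is a design assumption here; it matches the idea that the AI runs ahead of the queue rather than replacing it.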
The knowledge base also isn’t static. It’s updated whenever agents spot incorrect responses or when the product changes. Left to its own devices, a language model will fill gaps with plausible-sounding nonsense. A tightly maintained knowledge base keeps it honest.
What AI Support Chatbots Handle Well vs Where They Fall Short
Not all support questions are created equal. Understanding which types of queries AI handles well, and which still require a human, is essential for setting realistic expectations.
| Question type | AI coverage | Why |
|---|---|---|
| Pricing and plan details | ✅ Yes | Clear, standardized answers in knowledge base |
| GDPR / data processing | ✅ Yes | Policy-based, well-documented |
| Product troubleshooting | ✅ Yes | Step-by-step fixes covered in knowledge base |
| Setup and configuration | ✅ Yes | Documented workflows with clear instructions |
| Billing and subscription changes | ❌ Agent | Requires access to billing systems and account data |
| Refunds and retention | ❌ Agent | Requires judgment and manual processing |
| Site-specific display issues | ❌ Agent | Requires inspecting the customer’s actual environment |
| Account actions (transfers, recovery) | ❌ Agent | Requires manual operations in admin tools |
Where AI chatbots perform best
The pattern is consistent: if a customer’s question boils down to “explain this to me” or “help me set this up,” and the answer exists in the knowledge base, an AI chatbot delivers. Informational queries with clear, documented answers are where AI earns its keep.
In our data, the highest success rates came from GDPR and data processing queries, pricing and plan breakdowns, and product-specific troubleshooting for well-documented features. The common thread is that these are specific, recurring questions with standardized answers.
Where human agents are still essential
The limitations are just as consistent — and they apply to any AI chatbot for customer support, not just ours. These are categories in which the task requires system access, human judgment, or investigation that AI simply can’t perform.
The underlying pattern generalizes well: if the answer is information, AI handles it. If the answer is an action, a human is still needed.
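One way to picture that rule is as a small routing function. The sketch below is illustrative only: the category labels are hypothetical shorthand for the question types in the table above, and `route` is not part of any real product API.

```python
# Hypothetical category labels, loosely based on the question types in the table.
INFORMATIONAL = {"pricing", "gdpr", "troubleshooting", "setup"}
ACTION_REQUIRED = {"billing_change", "refund", "site_specific_display", "account_action"}


def route(category: str, answer_in_knowledge_base: bool) -> str:
    """Return who should own the resolution of a ticket."""
    if category in ACTION_REQUIRED:
        # The AI can still explain and collect details, but a human must act.
        return "ai_explains_then_agent_acts"
    if category in INFORMATIONAL and answer_in_knowledge_base:
        return "ai_answers"
    # Unknown category or undocumented answer: hand off rather than guess.
    return "agent"


print(route("pricing", True))   # ai_answers
print(route("refund", True))    # ai_explains_then_agent_acts
print(route("pricing", False))  # agent
```

Note that an informational question without a documented answer still goes to an agent, which reflects the dependence on knowledge base coverage.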
What AI Customer Support Looks Like in Practice
Data is useful, but seeing the actual interactions is what makes the difference tangible. Below are three real tickets from our experiment: two where the AI fully resolved the issue, and one where it couldn’t but still added value.
When the AI resolves it completely
Example 1: “Why isn’t my new Google review showing up?”
Customer details anonymized. AI reply compressed for readability.
Customer: "I got a new 5 star review Monday or Tuesday (today is Thursday) that is already showing on my Google Profile. How come it isn't pulling into the widget?"
AI Assistant: The most likely reason your new 5-star review isn't showing yet is our update (cache) interval. Our Google Reviews widget doesn't pull new reviews in real time — reviews in the widget are refreshed every 72 hours. There is no way to clear or refresh the cache manually from your side, because updates run automatically.
If more than 72 hours have passed and the review is still not visible, please check in your widget editor: make sure you're not excluding reviews accidentally via filters on the Settings tab, and confirm the minimum rating is not set higher than 5.
If everything looks correct and it's been more than 72 hours, please reply to this email and our Support Team will take a closer look.
Customer feedback: "Yeah the AI response was epic! Thanks sir."
A question like this would have sat in the queue for a day or two before an agent gave essentially the same answer. Instead, the customer had a clear explanation in minutes.
Example 2: “Facebook won’t connect, and TripAdvisor isn’t pulling reviews”
Customer details anonymized. AI reply compressed for readability.
Customer: "Setting up a new all in one reviews widget for a client who already has reviews on several sites. Google Reviews came right in, but Facebook won't connect and TripAdvisor hasn't brought any in."
AI Assistant: For your All‑in‑One Reviews widget, Google is working as expected, so the installation itself is fine. The issue is likely with how the Facebook and TripAdvisor sources are added.
Facebook: make sure you're using just the base page URL (without /reviews), that only .com domain links are used, and that the page is fully public with no visibility restrictions.
TripAdvisor: add a direct link to the business's listing page, not a search results page. The widget may need a few minutes to connect and fetch data after you add the source.
Also check your widget filters — make sure "Show only reviews with photos" is disabled and the minimum rating filter isn't hiding results.
Customer feedback: "Yes, the reviews are loading now. I did need to take the 's' off the https:// page URL so it would connect. And finally after a couple of hours, the TripAdvisor displayed. Thank you for your helpful AI response system with extra troubleshooting."
When the AI can’t resolve — but still helps
Not every ticket can be closed by a chatbot. Yet that doesn’t mean the AI adds no value. In many cases, the most useful thing it does isn’t answering the question — it’s preparing the ticket for the human who will.
Example 3: “I can’t log in, but your subscription is charging my card”
Customer details anonymized. AI reply compressed for readability.
Customer: "I used this to help out an older client. I no longer work for the company, and don't have access to my company email to sign in. I need to be able to sign in and cancel the subscription, change the payment account, or just change it to the free plan so that I'm not personally charged."
AI Assistant: When you can't access the email tied to your Elfsight account, our Support team will need to help with account recovery. To locate the right account faster, please reply with: the login email you think was used, a bank/card statement showing the Elfsight charge, and a direct link to the page where the widget is installed.
Customer: Provided every piece of requested information in their next reply — email, payment proof, widget URL — all organized in one message.
Even in the 78% of tickets where the AI’s answer was useful but not sufficient on its own, it consistently asked the right clarifying questions, requesting relevant details and context just as a human agent would. That behavior emerged from how the knowledge base was written: troubleshooting docs naturally include “if this doesn’t solve it, here’s what to share with support” sections, and the AI picked up on that pattern.
Measuring AI Support Quality
One of the most important things we learned is that measuring AI chatbot performance is harder than it looks, and relying on a single metric will mislead you.

The feedback silence problem
Only 12% of customers clicked a feedback button after receiving an AI response. The other 88% either didn’t interact at all (66%) or replied via email, bypassing the buttons (22%).
This isn’t a flaw in our setup — it’s a known pattern. Analysis of over 170,000 customer service chat sessions found nearly the same thing: only 16.2% of customers left feedback. When researchers modeled the likely satisfaction of the silent majority, they found that over half would have been dissatisfied, meaning feedback buttons systematically overrepresent extreme reactions.
Of our 12% who did click:
- 33% chose “All good”
- 25% said “Helpful but need more help”
- 42% clicked “Not helpful”
That 42% looks alarming until you realize it’s 42% of a self-selected 12% — heavily skewed by the fact that unhappy customers are far more motivated to click.
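To put the button clicks in proportion, the short calculation below multiplies each button's share by the 12% click rate. The function name is just for illustration; the percentages are the ones reported above.

```python
def share_of_all_customers(click_rate: float, button_split: dict) -> dict:
    """Convert 'share of clickers' into 'share of all customers'."""
    return {label: click_rate * share for label, share in button_split.items()}


click_rate = 0.12  # share of customers who clicked any button
button_split = {
    "All good": 0.33,
    "Helpful but need more help": 0.25,
    "Not helpful": 0.42,
}

for label, share in share_of_all_customers(click_rate, button_split).items():
    print(f"{label}: {share:.1%} of all customers")
# All good: 4.0% of all customers
# Helpful but need more help: 3.0% of all customers
# Not helpful: 5.0% of all customers
```

So the "Not helpful" clicks come from roughly 5% of all customers. This says nothing about how the remaining 88% felt, which is why button data alone is not enough.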
Why agent review matters
This is why we added a second evaluation layer. Support agents manually reviewed 1,685 tickets where customers hadn’t weighed in. Their assessment: 17% fully resolved, 78% useful but needed agent follow-up, 5% incorrect.
That’s a 95% useful-or-better rate from the people who know what a good support answer looks like. Combined with the customer CTA data, it produced the overall 89% useful figure.
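The exact way the two layers were combined is not spelled out here, so the following is only one plausible reconstruction: a ticket-weighted average. It assumes that the customer-feedback tickets are the 1,983 analyzed tickets minus the 1,685 agent-reviewed ones (298), and that both “All good” and “Helpful but need more help” clicks count as useful.

```python
def blended_useful_rate(agent_n: int, agent_useful: float,
                        customer_n: int, customer_useful: float) -> float:
    """Ticket-weighted average of the two evaluation layers."""
    total = agent_n + customer_n
    return (agent_n * agent_useful + customer_n * customer_useful) / total


analyzed = 1983                    # tickets analyzed after exclusions
agent_n = 1685                     # tickets reviewed by agents
customer_n = analyzed - agent_n    # assumption: the rest had button feedback (298)

agent_useful = 0.17 + 0.78         # fully resolved + useful with follow-up
customer_useful = 0.33 + 0.25      # assumption: these two buttons count as useful

rate = blended_useful_rate(agent_n, agent_useful, customer_n, customer_useful)
print(f"{rate:.1%}")  # 89.4%
```

Under those assumptions the blend lands at about 89%, in line with the headline figure, but a different treatment of email replies or excluded tickets would shift it slightly.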
- Two evaluation layers: Customer feedback (CTA buttons) + agent expert review
- Analysis period: 3 consecutive weeks
- Total tickets: 2,498 processed, 515 excluded as irrelevant, 1,983 analyzed
- Key lesson: Customer feedback alone skews negative. Always pair it with expert assessment.
If you’re deploying an AI chatbot, build in multiple ways to measure quality from the start. A single CSAT score or resolution rate won’t give you the real picture. For a deeper dive into which metrics matter most, see our chatbot KPI guide.
Honest Limitations to Know Before Deploying
No tool is perfect, and overselling the capabilities of a customer support chatbot is worse than underselling them. Here’s what to be realistic about.

Agent workload probably won’t drop
In our deployment, agents still review every ticket. The AI is consultative — it provides information but doesn’t take action. As a result, the support team’s daily workload hasn’t decreased in any meaningful way. The improvement is entirely on the customer side: faster, better first responses.
This is a common pattern for early-stage AI support deployments. Reducing agent workload requires giving the AI the ability to do things — process changes, trigger workflows, update records — not just say things. That’s a different level of trust and integration to build.
The 78% “middle zone” is the norm
The majority of AI-handled tickets, 78% in our case, fall into a gray area: the response is helpful but not sufficient on its own. The customer still needs an agent to finish the job. If you’re expecting an AI chatbot to close most tickets without any human intervention or oversight, you’ll be disappointed. If you’re expecting it to give customers a useful head start while your team catches up, you’ll be satisfied.
Knowledge base quality is an ongoing cost
A chatbot is only as good as the documentation feeding it, and that documentation requires continuous maintenance. Every product update, every new edge case, every incorrect response flagged by an agent means the knowledge base needs to be updated. This isn’t a one-time setup cost — it’s an ongoing operational commitment.
The Bottom Line
An AI chatbot won’t replace your support team — but it will change what customers experience in those first critical minutes after they reach out. In our data, that shift from days-long silence to a useful response in under five minutes mattered more than how many tickets the AI could close on its own.
The catch is that none of this works without a well-maintained knowledge base. Generic documentation produces generic answers; scenario-specific troubleshooting guides built from real ticket patterns produce responses customers actually trust. That’s an ongoing commitment, not a one-time setup.
Go in expecting faster first responses and better-prepared tickets for your agents, and an AI chatbot delivers. Go in expecting ticket volume to halve overnight, and you’ll be recalibrating within a month. The strongest argument for deploying a customer service chatbot might not be resolution at all — it’s the triage effect: even when the AI can’t solve the problem, it accelerates the solution.

