Notes on AI Customer Service Agents
Salesforce has agreed to acquire Fin (formerly Intercom) for $3.6bn. The deal is a useful signal of the value of customer service AI agents like Fin. Customer service is one of the primary places where AI agents are having a direct impact on everyday life. For most people, customer service has meant queues, ticket numbers and template replies. Great service has normally been expensive. A personal shopper. A private banker. A hotel concierge. Someone who knows the context of your situation and does the next useful thing. AI agents change the economics. They make it possible to give more people a version of the service that used to be reserved for high-value customers. And interest in AI agents more broadly has climbed to an all-time high.
Customer service is a clear first proving ground for AI agents because it combines language, tools, measurable outcomes and real customer pain.
What is a customer service AI agent?
A chatbot answers. An agent acts. That is a helpful distinction from how large language models have previously been used for customer service chatbots. A chatbot for an online store can tell you the return policy. An agent can check the order, apply the policy, trigger the refund and hand off to a human if required. More formally, an agent is a large language model (LLM) with instructions, tools and guardrails. Agents can reason through ambiguity and handle multi-step workflows with a high degree of autonomy. That sounds abstract, but narrowing the focus to customer service makes it concrete:
- Can the agent cancel the booking?
- Can it change the delivery address?
- Can it process the claim?
- Can it retain the customer?
- Can it know when not to act?
Easier to see than to describe. Below, a support agent takes one of these requests end to end – watch what it does between the customer's message and the reply.
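As a rough illustration in code, here is a minimal sketch of the "check the order, apply the policy, trigger the refund or hand off" flow. Everything in it is an assumption made for illustration: the tool names (`lookup_order`, `issue_refund`), the 30-day return window, the refund limit and the sample orders are all hypothetical. In a real agent an LLM would choose which tool to call; here the decisions are hard-coded so the control flow and the guardrail are easy to see.

```python
from dataclasses import dataclass
from datetime import date

# All names, limits and data below are hypothetical, for illustration only.
RETURN_WINDOW_DAYS = 30      # assumed policy: returns accepted within 30 days
AUTO_REFUND_LIMIT = 100.00   # assumed guardrail: larger refunds need a human


@dataclass
class Order:
    order_id: str
    amount: float
    delivered_on: date


ORDERS = {
    "A1": Order("A1", 40.00, date(2026, 5, 20)),
    "B2": Order("B2", 250.00, date(2026, 5, 25)),
    "C3": Order("C3", 30.00, date(2026, 3, 1)),
}


def lookup_order(order_id):
    """Tool: fetch the order from the stand-in order system."""
    return ORDERS.get(order_id)


def issue_refund(order):
    """Tool: trigger the refund; here it only returns a confirmation string."""
    return f"refunded {order.amount:.2f} for order {order.order_id}"


def handle_return_request(order_id, today):
    """Check the order, apply the policy, then act or hand off."""
    order = lookup_order(order_id)
    if order is None:
        return ("escalate", "order not found")
    if (today - order.delivered_on).days > RETURN_WINDOW_DAYS:
        return ("decline", "outside the return window")
    if order.amount > AUTO_REFUND_LIMIT:
        return ("escalate", "refund exceeds the auto-approval limit")
    return ("refund", issue_refund(order))


today = date(2026, 6, 1)
for oid in ("A1", "B2", "C3", "Z9"):
    print(oid, handle_return_request(oid, today))
```

The last check is the "know when not to act" case: the agent could issue the larger refund, but the guardrail routes it to a person instead.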
Customer service is an early test of agent effectiveness
Customer service is a good fit for agents because it is messy enough to need language, structured enough to measure and already contains workflows, systems and relatively clear outcomes. Good service has to understand what the customer wants, check policy, use internal systems and decide whether to act or escalate. Agents map well to this kind of work.
This alignment is why several serious players are building customer service AI, and why these early movers are scaling fast. Decagon describes itself as “the AI concierge for every customer” and says its agents are built to deliver personalised experiences across the customer lifecycle. In January 2026 it announced a $250m raise that tripled its valuation to $4.5bn, with more than 100 new global enterprise customers added over the prior fiscal year. Sierra, an AI agent company founded by Bret Taylor, OpenAI's chair and a former co-CEO of Salesforce, serves over 40% of the Fortune 50 and raised $950m in May 2026 at a $15bn valuation. Serious money is going into proving these agents deliver real value.
A useful Sierra example is Ocado, a UK online grocer and the world's largest dedicated online supermarket. Sierra doubled containment (the share of customer conversations handled without a human handoff) versus Ocado's previous chatbot and forecast positive ROI within a few months. The approach was simple – map customer interactions on a volume-versus-complexity matrix and begin with high-volume, lower-complexity tasks like delivery estimates and refund processing. This time-to-value focus shows the right deployment pattern: start with frequent, bounded work where the result can be measured.
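A small sketch of that prioritisation, with the containment definition alongside it. The interaction types, volumes, complexity scores and thresholds are invented for illustration and are not Ocado or Sierra data; the function names are hypothetical too.

```python
# Hypothetical interaction types with made-up monthly volumes
# and a 1-5 complexity score (1 = simplest).
INTERACTIONS = [
    {"task": "delivery estimate", "monthly_volume": 12000, "complexity": 1},
    {"task": "refund processing", "monthly_volume": 8000, "complexity": 2},
    {"task": "billing dispute", "monthly_volume": 1500, "complexity": 4},
    {"task": "account closure", "monthly_volume": 300, "complexity": 3},
]


def first_wave(interactions, min_volume=5000, max_complexity=2):
    """Pick the high-volume, lower-complexity quadrant, largest volume first."""
    picked = [
        i for i in interactions
        if i["monthly_volume"] >= min_volume and i["complexity"] <= max_complexity
    ]
    return sorted(picked, key=lambda i: i["monthly_volume"], reverse=True)


def containment_rate(total_conversations, human_handoffs):
    """Share of conversations handled without a human handoff."""
    if total_conversations == 0:
        return 0.0
    return (total_conversations - human_handoffs) / total_conversations


print([i["task"] for i in first_wave(INTERACTIONS)])
print(containment_rate(1000, 400))
```

With these made-up numbers the first wave is delivery estimates followed by refund processing. Where to draw the volume and complexity cut-offs is a judgment call for each business.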
Different approaches among the AI agent companies
Different companies are choosing different entry points. Gradient Labs (a UK-based AI agent company founded by former Monzo staff) is going vertical, targeting financial-services firms (e.g. Wise) and their specialist areas like disputes, KYC and lending. Sierra engages largely with established players across a wide range of industry sectors. And Decagon has more of a mix of newer tech-enabled customers and older enterprises using its AI agents. The common thread is customers that handle lots of similar-but-not-identical service interactions.
From demo to production
The first version of an agent can look impressive quickly. A demo can answer a support question. An agent interacting with customers needs permissions, policy logic, escalation rules, memory, testing, monitoring and analytics. One example of an issue that needs to be managed is drift between the agent's instructions and new information like new products, changes to the website and so on. It's like managing a team – move people onto a new area without training and performance drops. Agents are no different.
One useful Decagon concept is the Agent Operating Procedure (AOP). An AOP is an operating manual for the agent. This shared document is where different teams agree on how the company's agents should behave. Building these agents means deciding:
- What data and tools can the agent access?
- Who checks performance?
- How are instructions updated?
Agent solutions are not one-off software deployments. They need to adapt as the business changes. That shapes the engagement between the agent providers (Fin, Sierra, Decagon et al.) and the enterprises they support. The AI companies are long-running operating partners, often with forward-deployed teams that continuously build on the value agents deliver. Forward-deployed teams will often begin with a paid proof of concept (POC) lasting a few months. The goal is to get something integrated and in front of real customers during the POC. Agents are hard to judge from a demo. You only really learn from live traffic, actual customer engagement and real edge cases.
Concierge service at scale
A good agent should know the customer's history, understand the current problem and take the next useful step. It should not force the customer to repeat the same details across chat, email and phone (context should be shared across each surface). It should not just link to a general help centre article but instead deliver the specific outcome the customer wants.
The concierge framing is useful. A concierge does not just answer questions. They solve small problems before they become big ones. They remember preferences. They know when to act and when to ask for approval. That is the target level of service for the companies building customer service agents.
The agent layer is becoming multi-model and multi-channel. Agents need to respond through voice, email and SMS. One workflow may work best with one model, while another needs a different setup. The tools from Fin, Sierra and Decagon have to flex across service situations like applying membership perks, extending a rental or rebooking a spa appointment.
At the same time, there is a rough 80/20 rule when building these customer service solutions. Much of the structure can be standardised: how to greet, how to escalate, how to follow policy. But the last 20% matters. That is where brand, product details and local context live. That is where the experience starts to feel less like software and more like service.
Sierra leans hard into this; they've drawn on hospitality thinking (including bringing in advice from Will Guidara, the restaurateur and author of Unreasonable Hospitality) to shape how agents behave. The ambition is that agents can make a personal experience cheap enough to offer to many more end customers.
Anecdotally, people who have interacted with AI customer service tell me they appreciate how quickly the agents act on requests. Slow support tells the customer their problem is not important. Fast, contextual support tells them the company does not want to waste their time. This is likely linked to findings that customer satisfaction scores can be higher than they were before LLMs handled conversations directly.
Pricing
Pricing is one of the clearest signs that AI customer service is different from older software-as-a-service (SaaS).
| Model | Example | Strengths | Cons / risks |
|---|---|---|---|
| Seat / SaaS | Previous generation of software | Predictable, easy to budget, familiar to procurement | Can be misaligned as AI removes the “person logging in”; the incentive can be to sell seats, not get the job done |
| Task | Per-conversation pricing model often used by Decagon | Scales with usage and easier to attribute than seats | You pay for failed conversations too; can lead to a focus on usage rather than resolutions |
| Outcome-based | Charged when queries are resolved, aiming to align vendor incentives with customer outcomes. Seen with Fin and Sierra pay-per-successful-resolution approaches | Potential vendor and customer win-win scenario | Can be operationally and contractually complex. Weaker when impact is hard to attribute |
Outcome-based pricing looks like a strong fit. Care is needed, however, to ensure that outcome pricing does not create a fight over definitions. What counts as a resolution? What happens when the agent does 75% of the work, gathers the right information, and hands off to a human because the final step puts the agreed-upon “positive” outcome at too much risk? In that case, resolution-based pricing can distort the agent's value. There's a trade-off here when compared with per-conversation pricing. Task-based per-conversation pricing can be more resilient to these sorts of unintended consequences, while outcome pricing can more visibly align vendor and customer incentives.
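The trade-off can be put into simple arithmetic. The sketch below compares what a buyer would pay under each model; the prices, volumes and resolution rates are invented for illustration and are not any vendor's actual pricing.

```python
def per_conversation_cost(conversations, price_per_conversation):
    """Task-based: every conversation is billed, resolved or not."""
    return conversations * price_per_conversation


def per_resolution_cost(conversations, resolution_rate, price_per_resolution):
    """Outcome-based: only conversations counted as resolved are billed."""
    return conversations * resolution_rate * price_per_resolution


def break_even_resolution_rate(price_per_conversation, price_per_resolution):
    """Resolution rate at which both models cost the buyer the same."""
    return price_per_conversation / price_per_resolution


# Hypothetical month: 10,000 conversations, 1.00 per conversation
# versus 2.00 per resolution.
conversations = 10_000
task_bill = per_conversation_cost(conversations, 1.00)

# The same month under two definitions of "resolved": a strict one, and one
# where a well-prepared handoff to a human also counts.
strict_bill = per_resolution_cost(conversations, 0.40, 2.00)
lenient_bill = per_resolution_cost(conversations, 0.55, 2.00)

print(task_bill, strict_bill, lenient_bill)
print(break_even_resolution_rate(1.00, 2.00))
```

Nothing about the agent's work changes between the strict and lenient runs, only the definition of a resolution, yet the bill moves from below the per-conversation figure to above it. That gap is what the contract has to settle in advance.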
Evals as performance review
Whether or not AI agents are priced per outcome, vendor and customer need a shared agreement on what success should look like, and agent performance has to be measured against it. If agents are doing work people used to do, the performance review can be a useful analogy.
For a customer service agent, the performance scorecard might include:
- Containment rate
- Customer satisfaction
- Time to resolution
- Re-contact rate
- Policy breaches
- Refund rate
- Cost per resolved issue
- Revenue saved or generated
Containment rate is especially important but it has to be measured carefully, because a conversation that ends without escalation is not always a successful resolution. The customer might have simply given up.
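A minimal sketch of why the two numbers can diverge, assuming each conversation log records whether it was escalated, abandoned or followed by a re-contact. The field names and the sample log are hypothetical, and "verified resolution" is an illustrative label rather than an industry-standard metric.

```python
from dataclasses import dataclass


@dataclass
class Conversation:
    escalated: bool    # handed off to a human
    abandoned: bool    # customer left without confirming an outcome
    recontacted: bool  # customer came back about the same issue


def containment_rate(convs):
    """Share of conversations that ended without a human handoff."""
    if not convs:
        return 0.0
    return sum(not c.escalated for c in convs) / len(convs)


def verified_resolution_rate(convs):
    """Contained conversations that were neither abandoned nor re-opened."""
    if not convs:
        return 0.0
    resolved = sum(
        not c.escalated and not c.abandoned and not c.recontacted for c in convs
    )
    return resolved / len(convs)


# Hypothetical log of five conversations.
LOG = [
    Conversation(escalated=False, abandoned=False, recontacted=False),
    Conversation(escalated=False, abandoned=False, recontacted=False),
    Conversation(escalated=False, abandoned=True, recontacted=False),
    Conversation(escalated=False, abandoned=False, recontacted=True),
    Conversation(escalated=True, abandoned=False, recontacted=False),
]

print(containment_rate(LOG), verified_resolution_rate(LOG))
```

On this sample, containment is 0.8 while the stricter measure is 0.4, because one customer gave up and another had to come back.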
Pricing and evals are tightly linked:
- If you charge for resolved issues, you need to define how you'll evaluate resolution
- If you charge for retained customers, you need to define how you'll evaluate retention
- If you charge for completed workflows, you need to check the workflow actually helped
An interesting shift is that the eval layer itself is becoming agentic. Decagon's Watchtower acts as always-on QA, reviewing conversations continuously for signals like bugs or feature requests. Decagon's Duet tool then points to the next layer: an internal agent that can analyse conversations and recommend AOP changes. In other words, we're moving toward systems where not only is customer service done by AI, but improvements to customer service are increasingly driven by AI too.
The job change
A shift coming in customer service is that more people will manage work instead of doing tasks themselves. That means briefing agents, reviewing outputs and improving the system. The individual contributor starts to look more like a manager.
The Decagon agent product manager role is a useful glimpse of this future. The role involves shaping the workflow, reviewing AI performance and spotting gaps and improvement opportunities. That feels like the direction roles are heading.
Sierra's Agent Strategist role is another sign of where this is going. The strategist helps decide what the agent should do, how it should behave, which workflows to start with and how to improve it once it is live.
Other players have deployment strategist roles. Having “deployment” in the title is revealing. It's key to decide where the agent should work first and how the rollout should change the customer's operations. The strategist helps decide how success is measured and where the next expansion should happen. The strategist is managing the system.
These strategists often start with one painful workflow in a land-and-expand approach. If it works, the natural next question is: where else can this agent workforce go? That is where the deployment role matters. Someone has to keep finding the next workflow, proving the ROI and making sure the engagement can grow, so that the customer gets more value from its agents over time.
Working with agents is a form of delegation. If you were asking a person to do the task, what would they need? The goal, examples of what good looks like, access to the right systems. Agents need the same. Some tasks stop being done by people. New roles appear around agent operations, context design and governance. Strong customer-service teams will be the ones that learn to manage agents well.
What to watch for next
The current experience is uneven. In my own testing, questions were handed to a human when needed and responses were fast. But it still felt early in places (multimodal capabilities, for instance, were limited). The direction is impressive, and there's more to come.
A next frontier is proactive service. Most customer service starts when something has already gone wrong. A customer complains, waits, follows up, waits again. Agents make a different pattern possible. If the system knows a delivery is late, a booking is at risk or a rental needs action, the agent can reach out first. That is closer to how great human service works. The customer does not have to notice every problem before the company does.
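A minimal sketch of the late-delivery case, assuming the system can see a promised time and a current estimate for each delivery. The data model, the 30-minute tolerance and the message wording are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed tolerance: only reach out when the delay exceeds 30 minutes.
LATE_THRESHOLD = timedelta(minutes=30)


@dataclass
class Delivery:
    customer: str
    promised_by: datetime
    estimated_arrival: datetime


def proactive_notices(deliveries):
    """Return (customer, message) pairs for deliveries running late."""
    notices = []
    for d in deliveries:
        delay = d.estimated_arrival - d.promised_by
        if delay > LATE_THRESHOLD:
            minutes = int(delay.total_seconds() // 60)
            notices.append(
                (d.customer, f"Your delivery is running about {minutes} minutes late.")
            )
    return notices


# Hypothetical deliveries: one 45 minutes late, one 10 minutes late.
promised = datetime(2026, 6, 1, 18, 0)
DELIVERIES = [
    Delivery("ana", promised, promised + timedelta(minutes=45)),
    Delivery("ben", promised, promised + timedelta(minutes=10)),
]

for customer, message in proactive_notices(DELIVERIES):
    print(customer, message)
```

Only the first customer is contacted here. The threshold matters: set it too low and proactive service turns into noise.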
Sierra's Bret Taylor goes further, predicting that the vast majority of digital interactions will happen through an agent and that agents will become the primary way customers interact with brands. It's a big claim, but the direction makes sense. If customers can just ask for what they need, the website and app become less central. That makes the stakes higher, as the agent becomes the company's primary representative.
Another step is customer service that generates revenue. Sierra's travel and hospitality demos make this concrete. A hotel agent doesn't only modify a booking. It can suggest an upgrade, book dinner and arrange transport. An airline agent does not only handle a cancellation. It can rebook the customer, apply loyalty preferences and offer a better seat using points. That's a different category of service. The agent is solving the problem and spotting the next useful commercial action.
A structural question is what the foundation-model providers do next. OpenAI has begun embedding its own forward-deployed teams inside enterprises – a sign the model makers want to sit closer to production, not just supply the intelligence. My understanding, though, is that OpenAI is content for now to power the customer service platforms rather than compete with them. OpenAI supplies models behind both Sierra and Decagon and does not seem to be planning to enter their space directly. Whether that line holds, or the model providers move down the stack into the customer service layer themselves, is one of the more interesting open questions in the category.
Many of the interesting questions for AI customer service are management questions. Who checks that it is improving? Who decides when it can act autonomously? These questions matter. A bad agent will make a company feel colder. Success with customer service agents ultimately leads to retained customers and growth for the business. That is the prize.
Closing thoughts
Customer service agents are not the whole AI story. But they are one of the clearest places to see how this new technology is being applied.
Service becomes more personal.
Software is priced closer to outcomes.
Teams spend more time managing systems that act.
The companies that learn how to manage these agents will gain a significant advantage. Like a sharp, fast, slightly literal new hire, an agent needs context, feedback and clear limits. Give it those things and it can do useful work.