The 15-day pivot that saved a $1.3B company - Des Traynor [INTERCOM]
AI Summary
Des Traynor, co-founder of Intercom, discusses the company's radical pivot to AI, which revitalized its growth and redefined its business. In 2023, Intercom faced stagnant growth and a declining customer support market. In response, it launched "Fin," an AI agent priced per resolved conversation. Within 18 months, the company's growth rate doubled to 25% and annual recurring revenue reached $343 million.
The impetus for this rapid innovation was the launch of ChatGPT in late 2022. Traynor recounts how his head of AI shared ChatGPT with the team. In an early interaction, the CTO asked it how to install Intercom on a mobile app, and it gave a near-instant, perfect answer, faster than a human support team could respond and with the promise of 24/7 multilingual support. This made clear that AI would irreversibly change customer service, even with initial concerns about hallucinations. Intercom predicted AI could handle the "easy" support tasks, which constitute about 60% of work for many organizations. At launch, Fin handled 23-24% of support inquiries; it now handles around 70%, and Intercom fully automates about 85% of its total customer support.
Intercom's SaaS business was in the hundreds of millions of dollars but had experienced post-COVID growth spikes followed by a 2022 decline. Fin's stratospheric growth has since eclipsed Intercom's previous rapid expansion. The company viewed Fin as a potentially cannibalistic product, given its impact on help desk usage, but recognized it as essential for survival. Intercom operates as a single entity with Fin and its help desk as distinct products, with plans for more. While Fin is available to customers of other platforms such as Zendesk and Salesforce, the Intercom integration is positioned as the premium version because it is tuned to work seamlessly with Intercom's own help desk.
The development of AI software, Traynor emphasizes, requires a fundamentally different approach than traditional SaaS. Traditional software development assumes certainty and deterministic outcomes, focusing on features like file uploads. AI, however, is characterized by uncertainty about what is possible and how reliably it can be done. This necessitates an experimental approach, treating development as a series of experiments to achieve high-probability outcomes. Reliability is paramount, especially in mission-critical areas like customer support, where clear correct answers are expected. Intercom's iterative process, involving rigorous testing and monitoring, has been key to improving Fin's resolution rate from 23% at launch to its current 67-68%. Many existing AI products fail because they apply old SaaS development methodologies.
Traynor elaborates on the engineering required for reliable AI. He stresses that every aspect of software development must change: roadmaps, evaluation periods, product design, and the role of design itself. The old model involved gathering customer feedback, diagnosing feature requests, roadmapping, and building with reasonable certainty of delivery. The new AI world begins with exploring model capabilities, such as reliably determining refund eligibility. This involves creating "torture tests" with thousands of scenarios to evaluate AI performance against human quality.
For example, in a refund scenario, Intercom tests the AI's ability to apply fuzzy policies, like granting a refund if a customer seems legitimate and hasn't abused the system. This experimentation happens in an "AI lab" before productization. Once a capability is deemed reliable, productization begins, involving designers and product managers to build the surrounding infrastructure. Unlike traditional SaaS, where launch is the endpoint, AI products require continuous monitoring and iteration. Intercom tests millions of scenarios post-launch to assess customer configuration, guidance accuracy, and real-world performance. They develop instrumentation and telemetry to track metrics and identify areas for improvement.
Traynor contrasts this with hypothetical AI applications, like an expense tracking agent. A company might build a demo around prototypical receipts, which performs well, and then watch the product fail on messy real-world data, leading to customer disappointment and a damaged brand reputation. He advises against hyping AI capabilities before they are proven reliable.
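The scenario testing described above can be sketched as a small evaluation harness. This is a minimal illustration, not Intercom's actual tooling: the `Scenario` fields, the `toy_refund_agent` stand-in, and the `RELEASE_THRESHOLD` value are all hypothetical, and a real agent would be a model call rather than a hard-coded rule.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Scenario:
    """One hypothetical refund test case with a human-agreed correct decision."""
    description: str
    days_since_purchase: int
    prior_refunds: int
    expected_refund: bool


def toy_refund_agent(s: Scenario) -> bool:
    """Stand-in for a model call (assumption): refund recent purchases
    from customers who have not requested refunds repeatedly."""
    return s.days_since_purchase <= 30 and s.prior_refunds < 3


def pass_rate(agent: Callable[[Scenario], bool], scenarios: List[Scenario]) -> float:
    """Fraction of scenarios where the agent matches the expected decision."""
    correct = sum(agent(s) == s.expected_refund for s in scenarios)
    return correct / len(scenarios)


# A real suite would hold thousands of cases; four are enough to show the shape.
SCENARIOS = [
    Scenario("recent purchase, first request", 5, 0, True),
    Scenario("recent purchase, serial refunder", 5, 6, False),
    Scenario("old purchase, first request", 90, 0, False),
    Scenario("day-31 purchase, loyal customer", 31, 0, True),  # fuzzy edge case
]

RELEASE_THRESHOLD = 0.95  # hypothetical bar for moving from lab to product

rate = pass_rate(toy_refund_agent, SCENARIOS)
ready = rate >= RELEASE_THRESHOLD
print(f"pass rate: {rate:.2f}, ready to productize: {ready}")
```

The rigid 30-day rule fails the fuzzy edge case, so this toy agent stays below the bar. That is the kind of gap a large scenario suite is meant to expose before a capability is productized.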
Regarding model updates, Traynor cautions against companies immediately adopting the latest models without rigorous evaluation. Intercom conducts extensive testing, comparing new model outputs against current versions and human-written "perfect" answers. However, he notes that most improvements to Fin haven't come from new models but from better AI architecture, disambiguation, prompting, and the strategic use of different models. Intercom also post-trains its own models for specific aspects of customer experience, and any such change triggers a full re-evaluation of the system.
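A minimal sketch of that kind of comparison, assuming answers are scored by normalized exact match against human-written references. The question IDs, answers, and both helper functions are hypothetical; a production evaluation would likely use a more forgiving grader than string equality.

```python
from typing import Dict, List


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def score(answers: Dict[str, str], references: Dict[str, str]) -> float:
    """Fraction of questions whose answer matches the reference answer."""
    hits = sum(normalize(answers[q]) == normalize(ref) for q, ref in references.items())
    return hits / len(references)


def regressions(current: Dict[str, str], candidate: Dict[str, str],
                references: Dict[str, str]) -> List[str]:
    """Questions the current model answers correctly but the candidate does not."""
    return [
        q for q, ref in references.items()
        if normalize(current[q]) == normalize(ref)
        and normalize(candidate[q]) != normalize(ref)
    ]


# Hypothetical reference ("perfect") answers and two models' outputs.
REFERENCES = {
    "q1": "Refunds are available within 30 days.",
    "q2": "Yes, the mobile SDK supports iOS and Android.",
    "q3": "You can export data as CSV.",
}
CURRENT = {
    "q1": "Refunds are available within 30 days.",
    "q2": "yes, the mobile SDK supports iOS and Android.",
    "q3": "Data export is not supported.",
}
CANDIDATE = {
    "q1": "Refunds are available within 60 days.",
    "q2": "Yes, the mobile SDK supports iOS and Android.",
    "q3": "You can export data as CSV.",
}

print("current:", score(CURRENT, REFERENCES))
print("candidate:", score(CANDIDATE, REFERENCES))
print("regressions:", regressions(CURRENT, CANDIDATE, REFERENCES))
```

Both models score the same in aggregate here, yet the candidate breaks a question the current model gets right. Tracking per-question regressions alongside the headline score is one way to avoid swapping in a newer model that is quietly worse where it matters.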
Defining "good" in AI is crucial. For customer support, it means providing the correct answer, as determined through scenario testing and comparison against ideal responses. For less defined tasks like LinkedIn post generation, "good" can be proxied by user acceptance rates or engagement metrics, though these are less direct. Intercom uses internal definitions of perfect answers and a diverse set of scenarios to ensure the AI performs well across different user types and use cases.
The shift to AI has transformed Intercom into an R&D-heavy company, moving from predictable SaaS development to exploring unknown capabilities. Traynor explains that the decision to go "all in" on AI was driven by the existential threat it posed to businesses reliant on traditional help desks. He uses the analogy of Netflix vs. Blockbuster, where Blockbuster's failure to adapt to online streaming led to its demise. He advocates for founders to approach AI by assuming nothing exists and reimagining their product from scratch with AI at its core, integrating it wherever it can be made reliable and performant. This often requires reallocating resources from existing business areas and accepting potential customer churn.
Traynor acknowledges that a company's existing financial health can influence its decision-making speed. Intercom was already undergoing significant restructuring, which facilitated the AI pivot. However, he believes that even with strong growth, a company would be remiss not to invest heavily in AI if it poses an existential threat. He notes that many public SaaS companies are currently struggling because they haven't successfully transitioned to AI, impacting their valuations and long-term viability.
Addressing the risk of AI hallucinations, Traynor explains that Intercom relies on GPT-4 and an "actor-critic" technique. This involves a "red team" challenging every statement, demanding citations or logical grounds. Answers are evaluated at a per-unit level, similar to scientific papers. They employ retrieval-augmented generation (RAG) to source facts from multiple data sources and then infer answers. This includes abstracting concepts, like defining a "car wash dealership" to determine if a product like Riverside FM would be suitable. The inference process is then rigorously interrogated for material correctness and grounded logic. They also use "torture test" scenarios to tempt the AI into making unjustified assumptions and ensure it doesn't hallucinate or go off-topic.
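The idea of a critic that demands grounds for every statement can be illustrated with a toy check. This sketch assumes each claim in a draft answer carries the ID of the retrieved source it cites, and it approximates "grounded" with simple word overlap. The `Claim` type, the `min_overlap` threshold, and the sample data are hypothetical; in a system like the one described, the critic would itself be a model rather than a word-matching heuristic.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Claim:
    """One statement from a draft answer plus the source it cites, if any."""
    text: str
    source_id: Optional[str]


def content_words(text: str) -> Set[str]:
    """Lowercased words longer than three characters, punctuation stripped."""
    words = (w.strip(".,").lower() for w in text.split())
    return {w for w in words if len(w) > 3}


def is_grounded(claim: Claim, sources: Dict[str, str], min_overlap: float = 0.6) -> bool:
    """Toy critic: the claim must cite a known source, and most of its
    content words must appear in that source."""
    if claim.source_id is None or claim.source_id not in sources:
        return False
    claim_words = content_words(claim.text)
    if not claim_words:
        return False
    source_words = content_words(sources[claim.source_id])
    return len(claim_words & source_words) / len(claim_words) >= min_overlap


def review(claims: List[Claim], sources: Dict[str, str]) -> List[Claim]:
    """Return the claims the critic rejects as ungrounded."""
    return [c for c in claims if not is_grounded(c, sources)]


SOURCES = {"doc-1": "Refunds are available within 30 days of purchase."}
DRAFT = [
    Claim("Refunds are available within 30 days.", "doc-1"),
    Claim("Refunds include free shipping labels.", "doc-1"),   # not in the source
    Claim("Premium customers get lifetime refunds.", None),    # no citation at all
]

rejected = review(DRAFT, SOURCES)
for claim in rejected:
    print("rejected:", claim.text)
```

Checking each claim separately, rather than the answer as a whole, means one unsupported sentence can be caught even when the rest of the response is well sourced.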
From a business model perspective, Intercom shifted to a per-resolution pricing model, which Traynor believes is better aligned with the value delivered. The 99-cent per resolution price was determined by calculating the cost of human support, aiming for a transformative impact. It also considers the cost of customer seats, ensuring the AI solution is a no-brainer. He anticipates a broader industry trend towards outcome-based pricing, especially where clean outcomes can be defined, noting that a reluctance to adopt such models can be a "bad smell" indicating a lack of product confidence. For categories without clear outcomes, usage-based pricing or marked-up tokens might be more appropriate.
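The pricing logic can be made concrete with a short calculation. The 99-cent price and the roughly 70% resolution rate come from the summary above; the monthly volume and the human cost per conversation are hypothetical inputs chosen only to show the arithmetic.

```python
PRICE_PER_RESOLUTION = 0.99  # dollars, from the summary above


def monthly_cost(conversations: int, resolution_rate: float,
                 human_cost_per_conversation: float) -> float:
    """Blended monthly cost when the AI resolves a share of conversations
    and humans handle the rest."""
    ai_resolved = conversations * resolution_rate
    human_handled = conversations - ai_resolved
    return (ai_resolved * PRICE_PER_RESOLUTION
            + human_handled * human_cost_per_conversation)


# Hypothetical inputs: 10,000 conversations a month at $5.00 per
# human-handled conversation.
CONVERSATIONS = 10_000
HUMAN_COST = 5.00

baseline = monthly_cost(CONVERSATIONS, 0.0, HUMAN_COST)   # humans handle everything
with_ai = monthly_cost(CONVERSATIONS, 0.70, HUMAN_COST)   # AI resolves about 70%
savings = baseline - with_ai

print(f"baseline: ${baseline:,.2f}")
print(f"with AI:  ${with_ai:,.2f}")
print(f"savings:  ${savings:,.2f}")
```

Under per-resolution pricing the customer pays only for conversations the AI actually resolves, so the vendor's revenue scales with the same outcome the customer cares about.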
Regarding the CRM space, Traynor sees a significant evolution towards contextual, AI-powered platforms that provide summaries and insights rather than just key-value pairs. He cites Gong as an example of a company effectively presenting AI-generated summaries of sales calls. The future of CRM, he believes, will be about higher levels of knowledge and AI-powered capabilities acting as both servers and clients, drawing inferences and flagging potential issues. Intercom's focus is on developing Fin as a comprehensive "customer agent" capable of handling inbound sales, customer success, and delivering outcomes, while remaining CRM agnostic.
On the future of customer service jobs, Traynor predicts a reduction in frontline roles due to AI's capabilities. However, he sees new, exciting roles emerging for individuals skilled in training, managing, and leveraging AI for customer experience strategies. The emphasis will shift from traditional operational skills to conversational design, automation, and prompt engineering. He emphasizes that while there will be fewer jobs, the remaining and new roles will be higher impact.
His advice to founders is to deeply trust an AI expert to assess how much human involvement will be needed in their future product. The wrong approach is to add AI as a small feature; instead, founders should reimagine their product from scratch, assuming AI can handle most tasks, and integrating traditional UI only where human involvement is essential. This requires a clear vision, charting a course from the current state, potentially reallocating resources, and overcoming internal inertia and resistance. He stresses the urgency, suggesting founders have only a few years left to adapt before it's too late, especially in enterprise B2B software.