Today we're rolling out the Nuxt Agent (Beta) on nuxt.com: a first-party AI agent that lives right inside the site, deeply wired into our documentation, modules catalog, blog, deployment guides, and the wider Nuxt ecosystem. It replaces the third-party Kapa AI widget we used to ship, and takes the experience from "answer a question" to "help me get things done".
## From a docs widget to a first-party agent
For the past couple of years, the Kapa AI widget served as our docs Q&A experience. It did one thing well: search the docs and summarize an answer. But the Nuxt experience is more than just docs. It's modules, templates, deployment providers, the changelog, GitHub issues, playgrounds, and real navigation into the site.
We wanted something that could do all of that in one place, with the same design language as the rest of nuxt.com, the same content pipeline (Nuxt Content), and the same infrastructure we already run. So we built our own, on top of the Nuxt MCP server we shipped last November.
The result is an agent that:
- Grounds every answer in the official Nuxt documentation and ecosystem data, through structured MCP tools rather than retrieved text chunks.
- Renders rich UI: modules, templates, blog posts, hosting providers, and playground links come back as cards you can click, not as plain links buried in prose.
- Streams everything end to end, with tool call progress visible as it runs.
- Owns the loop: feedback, voting, and issue reporting flow directly into our internal tools so we can improve the agent over time.
## Meet the agent
### Three ways to talk to it
The agent is available everywhere on nuxt.com:
- As a side panel: pinned to the right on large screens (a `USidebar` with a rail), and as a slide-over on smaller screens. Toggle it from the header or with ⌘I.
- As a floating ask bar: on `/docs` and `/blog` pages, a bottom-centered "Ask anything…" input lets you jump into the agent without taking your eyes off the page.
- As a full-screen chat at `/chat`: a dedicated layout for longer sessions, with the full conversation persisted across reloads.
### It knows what page you're on
If you're reading a doc page and ask "how do I customize this for my app?", the agent already knows which page you mean. User messages are prefixed with a `[Page: /docs/...]` hint when relevant, and the model treats that as context, not as a command. You'll see an "Agent is using this page" indicator in the footer, which you can dismiss at any time.
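The prefixing described above can be sketched as a small helper. This is illustrative, not the actual nuxt.com code: the function name and the path filter are assumptions.

```typescript
// Sketch (not the actual nuxt.com source): prepend a [Page: …] hint to a
// user message when the reader is on a docs or blog page. The name
// withPageContext and the path filter are illustrative assumptions.
function withPageContext(message: string, path?: string): string {
  // Only forward paths that plausibly carry useful context.
  if (!path || !/^\/(docs|blog)(\/|$)/.test(path)) return message
  return `[Page: ${path}] ${message}`
}
```

Keeping the hint inside the message text (rather than a separate field) means the model sees it exactly where the user's question is, while the system prompt can instruct it to treat the hint as context only.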
### Rich answers, not just text
The agent doesn't just talk — it shows. When you ask about a module, it renders a module card with metadata pulled live from api.nuxt.com. Ask for starter templates and you'll get clickable template cards. Ask about deployment and you'll get provider cards linking to the right guide. Need to reproduce a bug? The agent can generate a StackBlitz playground link straight from the conversation.
Ask about the official Nuxt UI starters and it renders them (`nuxt-ui-dashboard`, `nuxt-ui-saas`, `nuxt-ui-landing`, `nuxt-ui-chat`, `nuxt-ui-docs`, `nuxt-ui-portfolio`) as cards you can open in one click.

### Suggested questions to get you unstuck
When you open the agent fresh, you're greeted with suggested prompts grouped by category — Getting Started, Features, and Deploy & Explore — driven by our `app.config.ts`. They're the questions we see most often, and a good starting point if you don't know where to begin.
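As a rough idea of what that config could look like (the key names and example questions below are illustrative assumptions, not the actual file contents):

```typescript
// Hypothetical shape of the suggestion config in app.config.ts.
// defineAppConfig is auto-imported in a Nuxt project.
export default defineAppConfig({
  agent: {
    suggestions: {
      'Getting Started': ['How do I create my first Nuxt app?'],
      'Features': ['How do auto-imports work?'],
      'Deploy & Explore': ['How do I deploy my app to the edge?']
    }
  }
})
```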
### Feedback is a first-class citizen
Every assistant message can be voted up or down with a thumb. And if the agent can't help — either because it ran out of ideas or because you hit a rough edge — the Report issue action opens a short form, then files a Linear issue on our side with the full conversation attached. No copy-pasting, no screenshotting. We see every one of them.
### Persistent, resumable chats
Conversations are stored per user and resumed across reloads, so you can step away and pick up where you left off. Each chat has a stable `chatId` passed through a custom header on every request.
## What the agent can actually do
Under the hood, the agent has access to two families of tools.
### Grounded in Nuxt content (via MCP)
These tools come from our own Nuxt MCP server, the exact same server that any external AI assistant (Cursor, Claude Desktop, ChatGPT, etc.) can connect to. That means the Nuxt Agent and your local AI assistant are looking at the same structured data:
- `list_documentation_pages` / `get_documentation_page` — browse and read the official docs, with optional H2-level section filtering to stay token-efficient.
- `get_getting_started_guide` — fetch the getting-started content on demand.
- `list_blog_posts` / `get_blog_post` — read through announcements and articles.
- `list_modules` / `get_module` — discover and inspect modules from the catalog.
- `list_deploy_providers` / `get_deploy_provider` — surface the right deployment guide for your target.
- `get_changelog` — fetch releases directly from the official Nuxt repositories.
### Rich UI tools
On top of the MCP grounding, the agent has a second set of tools that render directly as UI inside the chat:
- `show_module` — module card from the nuxt.com modules API.
- `show_template` — template cards (accepts multiple slugs at once).
- `show_blog_post` — a blog post card.
- `show_hosting` — a deployment provider card.
- `open_playground` — a StackBlitz link builder.
- `search_github_issues` — searches across the `nuxt`, `nuxt-modules`, and `nuxt-content` orgs; the agent uses this first when you paste an error.
- `report_issue` — triggers the in-chat feedback form.
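To give a feel for the link-builder style of these tools, here is an illustrative take on an `open_playground`-like helper. StackBlitz can open a GitHub repository directly from a URL; the repository and branch values below are assumptions, not what the real tool uses.

```typescript
// Illustrative sketch of an open_playground-style link builder.
// StackBlitz opens a GitHub repo via /github/{repo}/tree/{branch}.
function buildPlaygroundUrl(repo: string, branch = 'main'): string {
  return `https://stackblitz.com/github/${repo}/tree/${branch}`
}
```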
### And the web, when it's actually useful
Finally, the agent has Anthropic's native `web_search` tool available — but only for things genuinely outside the Nuxt universe (recent events, third-party services, etc.). The system prompt makes this clear: no proactive web searching.
## Under the hood
Time to get technical.
### The stack
```
┌──────────────────────────┐       ┌────────────────────────┐
│   @ai-sdk/vue Chat       │       │   POST /api/agent      │
│   DefaultChatTransport   │────▶  │   AI SDK v6 streamText │
└──────────────────────────┘       │  (UIMessage streaming) │
                                   └────────────┬───────────┘
                                                │
       ┌────────────────────────────────────────┼──────────────────────────┐
       ▼                                        ▼                          ▼
┌──────────────────┐              ┌──────────────────────┐    ┌──────────────────────┐
│ Anthropic        │              │ Nuxt MCP Server      │    │ Native tools         │
│ claude-sonnet-4.6│              │ (same-origin /mcp)   │    │ show_*, open_*, etc. │
│ + web_search     │              │ docs, modules, blog, │    │                      │
│                  │              │ deploy, changelog    │    │                      │
└──────────────────┘              └──────────────────────┘    └──────────────────────┘
```
- Model: `anthropic/claude-sonnet-4.6`, accessed through the Anthropic provider.
- Server: a single Nitro event handler at `server/api/agent.post.ts`.
- Client: an `@ai-sdk/vue` `Chat` instance with a `DefaultChatTransport` pointing at `/api/agent`.
- State: a shared `useNuxtAgent` composable driving the side panel, the floating bar, and the full-screen page.
- Storage: Drizzle ORM, with one row per chat, accumulating token usage, estimated cost, and duration over the lifetime of the conversation.
- Observability: `evlog` wraps the model with structured AI telemetry (tokens, cost, tool calls).
### UIMessage streaming with AI SDK v6
The whole pipeline is built on the AI SDK v6 UIMessage streaming model. On the server:
```ts
const stream = createUIMessageStream({
  execute: async ({ writer }) => {
    const result = streamText({
      model: ai.wrap(MODEL),
      maxOutputTokens: 4000,
      stopWhen: stopWhenResponseComplete,
      system: systemPrompt,
      messages: await convertToModelMessages(messages),
      tools: {
        ...mcpTools as ToolSet,
        web_search: anthropic.tools.webSearch_20250305(),
        search_github_issues: createSearchGitHubIssuesTool(event),
        show_module: showModuleTool,
        show_template: createShowTemplateTool(event),
        show_blog_post: createShowBlogPostTool(event),
        show_hosting: createShowHostingTool(event),
        open_playground: openPlaygroundTool,
        report_issue: reportIssueTool
      },
      experimental_telemetry: {
        isEnabled: true,
        integrations: [createEvlogIntegration(ai)]
      }
    })

    writer.merge(result.toUIMessageStream({
      sendSources: true,
      originalMessages: messages,
      onFinish: ({ messages: finalizedMessages }) => {
        event.waitUntil(saveChat(finalizedMessages))
      }
    }))
  }
})
```
A few details worth calling out:
- `stopWhen: stopWhenResponseComplete` is a small custom predicate that stops the loop as soon as the model produces text without any new tool calls, with a hard ceiling of 10 steps. This avoids the classic "model loops forever on tools" failure mode while still allowing multi-step tool chains.
- `sendSources: true` forwards source metadata to the client so citations can be rendered inline.
- `event.waitUntil(saveChat(finalizedMessages))` pushes chat persistence outside the response lifecycle — the stream finishes for the user immediately, and the chat record is upserted asynchronously.
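A plausible reconstruction of that stop predicate, shaped like an AI SDK stop condition (a function of `{ steps }` returning a boolean). The `StepLike` field names mirror the SDK's step results but are assumptions here, as is the exact logic:

```typescript
// Sketch of a stopWhenResponseComplete-style predicate (not the actual
// nuxt.com source). Field names on StepLike are assumptions.
interface StepLike {
  text: string
  toolCalls: unknown[]
}

function stopWhenResponseComplete({ steps }: { steps: StepLike[] }): boolean {
  const last = steps[steps.length - 1]
  // Stop when the model answered in plain text without queuing tool calls…
  const answered = last.text.trim().length > 0 && last.toolCalls.length === 0
  // …or when the hard ceiling of 10 steps is reached.
  return answered || steps.length >= 10
}
```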
### One MCP server, two consumers
The most important architectural decision is that the agent and external AI assistants talk to the same MCP server. The route handler opens an HTTP MCP client pointed at its own `/mcp` endpoint:
```ts
const mcpUrl = import.meta.dev
  ? `http://localhost:3000${MCP_PATH}`
  : `${getRequestURL(event).origin}${MCP_PATH}`

const httpClient = await createMCPClient({
  transport: { type: 'http', url: mcpUrl }
})

const mcpTools = await httpClient.tools()
```
Those tools are then merged with the native UI tools into a single `tools` object passed to `streamText`. Concretely:
- Every tool you can use from Cursor or Claude Desktop against the Nuxt MCP server, the Nuxt Agent can also use.
- Any new MCP tool we add to the server becomes instantly available to the agent without a separate wiring step.
If you want to dive deeper into how the MCP server itself is built, we wrote a dedicated blog post back in November.
### Persistence, usage, and cost tracking
Chats are persisted in a single `agent_chats` table, keyed by the `x-chat-id` header the client sends on every request. The `onConflictDoUpdate` pattern accumulates token usage and cost across the lifetime of the chat:
```ts
await db.insert(schema.agentChats).values({ /* ... */ })
  .onConflictDoUpdate({
    target: schema.agentChats.id,
    set: {
      messages: finalizedMessages,
      inputTokens: sql`${schema.agentChats.inputTokens} + ${inputTokens}`,
      outputTokens: sql`${schema.agentChats.outputTokens} + ${outputTokens}`,
      estimatedCost: sql`${schema.agentChats.estimatedCost} + ${estimatedCost}`,
      durationMs: sql`${schema.agentChats.durationMs} + ${durationMs}`,
      requestCount: sql`${schema.agentChats.requestCount} + 1`,
      updatedAt: new Date()
    }
  })
```
This gives us per-chat usage analytics at zero runtime cost, and a single source of truth for resumption.
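For a back-of-the-envelope idea of how an `estimatedCost` value can be derived from token usage (the per-million-token prices below are placeholders, not actual Anthropic rates):

```typescript
// Sketch: estimate request cost from token counts.
// Prices are assumed USD-per-1M-token placeholders, not real rates.
const PRICE_PER_MTOK = { input: 3, output: 15 }

function estimateCost(inputTokens: number, outputTokens: number): number {
  return (
    inputTokens * PRICE_PER_MTOK.input +
    outputTokens * PRICE_PER_MTOK.output
  ) / 1_000_000
}
```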
### Rate limiting and abuse protection
Every request goes through a `consumeAgentRateLimit` helper before streaming starts. The current limit is 20 messages per day per IP fingerprint — enough for real use, low enough to prevent runaway costs from accidental (or intentional) loops.
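A minimal in-memory sketch of such a limiter, assuming the same shape (return `true` to allow, `false` to reject); the real helper presumably uses shared storage so limits survive restarts and multiple instances:

```typescript
// In-memory sketch of a per-IP daily limiter in the spirit of
// consumeAgentRateLimit (not the actual nuxt.com implementation).
const DAILY_LIMIT = 20
const counters = new Map<string, { day: string, count: number }>()

function consumeAgentRateLimit(ip: string, now = new Date()): boolean {
  const day = now.toISOString().slice(0, 10) // bucket by UTC day
  const entry = counters.get(ip)
  if (!entry || entry.day !== day) {
    counters.set(ip, { day, count: 1 }) // first message of the day
    return true
  }
  if (entry.count >= DAILY_LIMIT) return false // budget exhausted
  entry.count += 1
  return true
}
```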
### A tight system prompt
A lot of agent quality comes from the prompt. Ours emphasizes:
- Identity: the agent is the Nuxt Agent, not a generic chatbot. It's confident, precise, and grounded.
- Token efficiency:
get_documentation_pageandget_blog_postmust always be called with asectionsparameter;show_moduleis preferred overget_modulefor UI answers. - Debugging flow: when the user shares an error, hit
search_github_issuesfirst — acrossnuxt,nuxt-modules, andnuxt-content— before anything else. - No accidental web search: only on explicit user request or as a last resort.
- Always respond with text: never end a turn with just a tool call.
These rules dramatically reduce tool-spam and hallucinations, and make the agent feel like it actually knows what it's doing.
## What's next
The agent is launching in Beta, and there's a lot we still want to do: richer memory across sessions, more tools (playgrounds with custom dependencies, interactive module configuration, Nuxt DevTools integration), smarter source citations, and better coverage of the broader ecosystem.
We'd love your help shaping it. If the agent gets something wrong, or misses a feature you'd want, use the Report issue button right inside the chat — it creates a ticket on our side with the full conversation attached, and we read every one.
The complete source code for nuxt.com — including the agent, the MCP server, and all the tools described above — is available on GitHub. The agent handler lives at `server/api/agent.post.ts`, the native tools at `server/utils/tools/`, and the UI components at `app/components/agent/`. Feel free to use it as inspiration for your own apps — and if you want to build your own MCP server, the Nuxt MCP Toolkit makes it a few-minute job.