MCP Apps - Bringing UI Capabilities To MCP Clients
Key Points
- MCP Apps are a new official extension for the Model Context Protocol (MCP) enabling tools to return interactive UI components like dashboards, forms, and visualizations directly within conversational interfaces.
- They function by allowing tools to declare a UI resource that is rendered in a sandboxed iframe, facilitating bidirectional communication between the UI and the host to overcome the limitations of plain text responses for complex interactions.
- This standardization provides developers with a robust framework for creating rich, dynamic agentic experiences across diverse clients like ChatGPT, Claude, and VS Code, enhancing user interaction with AI agents.
MCP Apps represent an official extension of the Model Context Protocol (MCP), enabling tools to return rich, interactive UI components directly within conversation interfaces, rather than solely plain text. This allows for the rendering of elements such as dashboards, forms, visualizations, and multi-step workflows. This standard builds upon the foundational work of MCP-UI and the OpenAI Apps SDK, unifying previous patterns into a production-ready, open standard.
The core methodology of MCP Apps relies on two key MCP primitives:
- Tools with UI metadata: Tools include a _meta.ui.resourceUri field within their definition, which points to a specific UI resource. For example:
    {
      name: "visualize_data",
      description: "Visualize data as an interactive chart",
      inputSchema: { /* ... */ },
      _meta: {
        ui: {
          resourceUri: "ui://charts/interactive"
        }
      }
    }
- UI Resources: These are server-side resources served via a dedicated ui:// scheme, which contain bundled HTML and JavaScript.
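The link between a tool and its UI resource can be sketched in TypeScript. This is a minimal illustration, not the official type definitions: the ToolDefinition interface and the uiResourceUri helper are assumptions modeled only on the example above, and get_summary is a hypothetical text-only tool added for contrast.

```typescript
// Assumed shape of a tool definition, based only on the example above.
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
  _meta?: { ui?: { resourceUri?: string } };
}

// Returns the declared UI resource URI, or undefined when the tool
// declares none or the URI does not use the ui:// scheme.
function uiResourceUri(tool: ToolDefinition): string | undefined {
  const uri = tool._meta?.ui?.resourceUri;
  return uri !== undefined && uri.startsWith("ui://") ? uri : undefined;
}

const chartTool: ToolDefinition = {
  name: "visualize_data",
  description: "Visualize data as an interactive chart",
  inputSchema: { type: "object" },
  _meta: { ui: { resourceUri: "ui://charts/interactive" } },
};

// Hypothetical text-only tool: no _meta, so no UI is rendered for it.
const plainTool: ToolDefinition = {
  name: "get_summary",
  description: "Return a text summary",
  inputSchema: { type: "object" },
};

const chartUi = uiResourceUri(chartTool);
const plainUi = uiResourceUri(plainTool);
```

A host could use a check like this to decide whether a tool result should be shown as plain text or handed to a UI renderer.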
When a tool whose definition carries _meta.ui.resourceUri is called, the host client fetches the specified UI resource and renders it within a sandboxed iframe. Bidirectional communication between the UI content inside the iframe and the host client is established via JSON-RPC over the postMessage API. This architecture overcomes the limitations of text-only interactions, which create a "context gap" for user exploration and dynamic interaction (e.g., sorting data, filtering, or real-time monitoring). MCP Apps provide live updates, native media viewers, persistent states, and direct manipulation, making interactions feel like using a standard web application.
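The JSON-RPC exchange between the iframe and the host can be sketched without a browser. The example below is an illustration under stated assumptions: the envelope follows JSON-RPC 2.0, but the method names example/echo and example/unknown are hypothetical and are not part of the MCP Apps specification. In a real page, the request object would travel through window.parent.postMessage and the host would receive it in a "message" event listener; here the dispatch function is called directly.

```typescript
// JSON-RPC 2.0 envelopes.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: unknown;
}
interface JsonRpcSuccess {
  jsonrpc: "2.0";
  id: number;
  result: unknown;
}
interface JsonRpcFailure {
  jsonrpc: "2.0";
  id: number;
  error: { code: number; message: string };
}
type JsonRpcResponse = JsonRpcSuccess | JsonRpcFailure;

type Handler = (params: unknown) => unknown;

// Host side: route a request from the UI to a registered handler and
// answer with the same id, or with the standard "Method not found" error.
function dispatch(handlers: Map<string, Handler>, req: JsonRpcRequest): JsonRpcResponse {
  const handler = handlers.get(req.method);
  if (handler === undefined) {
    return { jsonrpc: "2.0", id: req.id, error: { code: -32601, message: "Method not found" } };
  }
  return { jsonrpc: "2.0", id: req.id, result: handler(req.params) };
}

// Hypothetical handler table: the method name is made up for this sketch.
const hostHandlers = new Map<string, Handler>([
  ["example/echo", (params) => params],
]);

const ok = dispatch(hostHandlers, {
  jsonrpc: "2.0",
  id: 1,
  method: "example/echo",
  params: { sort: "asc" },
});
const missing = dispatch(hostHandlers, {
  jsonrpc: "2.0",
  id: 2,
  method: "example/unknown",
});
```

Because every message is a plain object with an id, the host can match responses to requests and log each exchange.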
Developers build new MCP Apps using the @modelcontextprotocol/ext-apps package, which provides an App class for UI-to-host communication. Key methods within this API include:
- app.connect(): Establishes the connection to the host.
- A tool-result callback: Receives tool results from the host, enabling the UI to render data dynamically.
- app.callServerTool({ name: "tool_name", arguments: { ... } }): Allows the UI to invoke server-side tools, facilitating multi-step workflows or fetching additional data based on user interaction.
- app.updateModelContext({ content: [{ type: "text", text: "..." }] }): Enables the UI to update the model's context, giving the underlying agent real-time feedback on user actions or selections. This can drive the conversation forward or quietly update state for later use.
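The way a UI might chain these calls can be sketched with a stand-in. The code below does not import @modelcontextprotocol/ext-apps: AppLike is a hypothetical interface that mirrors only the call shapes quoted above, the return types are assumptions, RecordingApp is a fake that logs calls instead of contacting a host, and fetch_region_data is a made-up tool name.

```typescript
type TextBlock = { type: "text"; text: string };
interface ToolCallRequest {
  name: string;
  arguments: Record<string, unknown>;
}
interface ContextUpdate {
  content: TextBlock[];
}

// Hypothetical stand-in for the App surface described above.
// Return types are assumptions; the real package may differ.
interface AppLike {
  connect(): Promise<void>;
  callServerTool(req: ToolCallRequest): Promise<unknown>;
  updateModelContext(update: ContextUpdate): Promise<void>;
}

// Fake implementation that records calls instead of talking to a host.
class RecordingApp implements AppLike {
  readonly log: string[] = [];
  connect(): Promise<void> {
    this.log.push("connect");
    return Promise.resolve();
  }
  callServerTool(req: ToolCallRequest): Promise<unknown> {
    this.log.push(`callServerTool:${req.name}`);
    return Promise.resolve({ rows: [] });
  }
  updateModelContext(update: ContextUpdate): Promise<void> {
    const text = update.content.map((block) => block.text).join(" ");
    this.log.push(`updateModelContext:${text}`);
    return Promise.resolve();
  }
}

// Build the request payloads in the shapes quoted above.
function regionFilterRequest(region: string): ToolCallRequest {
  return { name: "fetch_region_data", arguments: { region } };
}
function selectionNote(region: string): ContextUpdate {
  return { content: [{ type: "text", text: `User selected region ${region}` }] };
}

// UI event handler: fetch more data, then tell the model what the user did.
async function onRegionSelected(app: AppLike, region: string): Promise<void> {
  await app.callServerTool(regionFilterRequest(region));
  await app.updateModelContext(selectionNote(region));
}
```

Keeping the payload builders separate from the handler makes the message shapes easy to check without a running host.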
The security model for MCP Apps is multi-layered to mitigate risks associated with running external UI code:
- Iframe Sandboxing: All UI content runs in highly restricted sandboxed iframes, limiting potential malicious actions.
- Pre-declared Templates: Hosts can review and approve HTML content before it is rendered, adding a layer of pre-emptive security.
- Auditable Messages: All UI-to-host communication occurs via loggable JSON-RPC messages, providing transparency and an audit trail.
- User Consent: Hosts can require explicit user approval for UI-initiated tool calls, giving users control over actions taken on their behalf.
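The consent and audit layers can be illustrated with a small host-side gate. This is a sketch under assumptions, not how any particular host implements it: ToolCallGate, its askUser callback, and the tool names are all hypothetical, and a real host would likely prompt the user asynchronously rather than apply a synchronous rule.

```typescript
interface UiToolCall {
  name: string;
  arguments: Record<string, unknown>;
}
interface AuditEntry {
  tool: string;
  decision: "allowed" | "denied";
}

// Hypothetical host-side gate: every UI-initiated tool call passes
// through authorize(), which asks for consent and records the outcome.
class ToolCallGate {
  readonly audit: AuditEntry[] = [];
  private readonly askUser: (call: UiToolCall) => boolean;

  constructor(askUser: (call: UiToolCall) => boolean) {
    this.askUser = askUser;
  }

  authorize(call: UiToolCall): boolean {
    const approved = this.askUser(call);
    this.audit.push({ tool: call.name, decision: approved ? "allowed" : "denied" });
    return approved;
  }
}

// Stand-in for a consent prompt: approve only tools whose name starts
// with "get_". The naming rule is made up for this sketch.
const gate = new ToolCallGate((call) => call.name.startsWith("get_"));

const readAllowed = gate.authorize({ name: "get_report", arguments: {} });
const deleteAllowed = gate.authorize({ name: "delete_report", arguments: {} });
```

Both decisions land in gate.audit, so a denied call leaves the same trail as an approved one.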
MCP Apps are designed for broad client compatibility, with support already available in platforms like Claude, Goose, Visual Studio Code, and ChatGPT. This standardization allows tool developers to create interactive experiences that function across a diverse range of clients without requiring client-specific code.