Frontend-Triggered Queries

By default, MCP can only be used for communicating from an AI agent to the tool-offering server. However, when building AI-native web apps, one often wants to trigger queries from the frontend app.

While one could simply build a custom AI agent, wouldn't it be nice to use the same MCP tools so that queries can both be triggered from an AI agent (like Claude Desktop) and the frontend app itself?

To support this, MCP-Web routes frontend-triggered queries to an AI agent server, which processes them using the same MCP tools.

Relationship to MCP Sampling

MCP-Web's frontend-triggered queries serve a similar purpose to MCP Sampling: both let the application that offers the tools, rather than the AI agent, request LLM completions. However, there are key differences:

How They Compare

| Aspect | MCP-Web Queries | MCP Sampling |
| --- | --- | --- |
| Direction | Frontend → Agent Server → LLM | MCP Server → MCP Client (e.g., Claude Desktop) → LLM |
| Human-in-the-loop | ❌ No approval required | ✅ User must approve each request |
| API Key | Required on agent server | Not needed (uses client's LLM access) |
| Architecture | Requires separate agent server | Direct server→client request |

When to Use Each

Use MCP-Web Queries when:

  • You need automated, uninterrupted AI interactions (games, real-time features)
  • User approval for each AI call would disrupt the experience
  • Example: The Checkers demo where the AI opponent needs to make moves without user approval for each turn

Use MCP Sampling when:

  • Human oversight is desirable or required for trust & safety
  • You want to avoid managing LLM API keys on your server
  • The user expects to review AI actions before they execute
  • Example: An AI assistant that drafts emails—users may want to review before sending

Coming Soon

MCP-Web will add support for MCP Sampling, giving you the choice between automated queries (current approach) and human-in-the-loop sampling (MCP standard). This will enable scenarios where user approval adds value rather than friction.

MCP-Web Queries vs. Custom Protocol + Agent

MCP-Web's frontend-triggered queries ultimately serve the same purpose as directly calling an LLM provider or an AI agent, so it's worth taking a moment to consider which approach would be best for your use case.

✅ Use MCP-Web's queries when…

…you want tool unification. That is, your app already exposes MCP tools using MCP-Web and you want frontend-triggered AI queries to use those same tools, or your AI agent already uses other MCP tools and you want to apply the same tool discovery and calling protocol.

For instance, imagine building a todo app where you want an AI agent to be able to add todos from meeting notes in Google Drive while at the same time being able to drop meeting notes into the frontend app to be turned into todos.

❌ Use a custom protocol when…

…you don't need to expose MCP tools or you have custom AI workflows that don't fit well into MCP.

For instance, for a simple chat interface that only needs AI completions without any tool calls, directly calling your LLM provider is simpler than MCP-Web's query routing.

What You Need

1. Agent Server

Frontend queries require an agent server. This server needs to handle at least two endpoints:

  • PUT /query/{uuid} for new queries
  • DELETE /query/{uuid} for canceling queries
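
The two endpoints above can be sketched as a small routing function. This is a minimal sketch, not MCP-Web's own implementation: the `routeQueryRequest` helper and the `QueryAction` type are hypothetical names, and it assumes the default `/query/{uuid}` path. It only decides what an incoming request means; wiring it into an HTTP framework is left out.

```typescript
// Hypothetical result type: what the agent server should do with a request.
type QueryAction =
  | { kind: 'start'; uuid: string }
  | { kind: 'cancel'; uuid: string }
  | { kind: 'not_found' }
  | { kind: 'method_not_allowed' };

// Matches "/query/{uuid}" with a non-empty final segment and an optional trailing slash.
const QUERY_PATH = /^\/query\/([^/]+)\/?$/;

// Hypothetical helper: maps an HTTP method and path to a query action.
function routeQueryRequest(method: string, path: string): QueryAction {
  const match = QUERY_PATH.exec(path);
  if (!match) return { kind: 'not_found' };

  const uuid = match[1];
  switch (method.toUpperCase()) {
    case 'PUT':
      return { kind: 'start', uuid }; // New query
    case 'DELETE':
      return { kind: 'cancel', uuid }; // Cancel a running query
    default:
      return { kind: 'method_not_allowed' };
  }
}
```

Keeping the routing decision in a pure function makes it easy to test without starting a server.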

Query Request Format

When your agent server receives a PUT /query/{uuid} request, the body contains:

typescript
{
  uuid: string;              // Unique query identifier
  prompt: string;            // User's instruction to the AI
  context?: Array<{          // Optional context items
    name: string;
    value: unknown;          // Pre-computed value (for tool context)
    schema: JSONSchema;      // JSON Schema describing the value
    description?: string;
    type: 'ephemeral' | 'tool';
  }>;
  responseTool?: {           // Optional: require AI to call this tool
    name: string;
    inputSchema: JSONSchema;
    outputSchema?: JSONSchema;
  };
}
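
Because the body arrives as untyped JSON, an agent server may want to check it before use. The following is a minimal sketch of such a check, based only on the fields listed above. The names `isQueryRequest`, `QueryRequest`, and `QueryContextItem` are hypothetical, and `JSONSchema` is simplified to a plain object type for illustration.

```typescript
// Assumption: JSON Schema is modeled as a plain object for this sketch.
type JSONSchema = Record<string, unknown>;

interface QueryContextItem {
  name: string;
  value: unknown;
  schema: JSONSchema;
  description?: string;
  type: 'ephemeral' | 'tool';
}

interface QueryRequest {
  uuid: string;
  prompt: string;
  context?: QueryContextItem[];
  responseTool?: { name: string; inputSchema: JSONSchema; outputSchema?: JSONSchema };
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === 'object' && v !== null && !Array.isArray(v);
}

function isContextItem(v: unknown): v is QueryContextItem {
  return (
    isRecord(v) &&
    typeof v.name === 'string' &&
    isRecord(v.schema) &&
    (v.description === undefined || typeof v.description === 'string') &&
    (v.type === 'ephemeral' || v.type === 'tool')
  );
}

// Hypothetical type guard for the PUT /query/{uuid} body.
function isQueryRequest(v: unknown): v is QueryRequest {
  if (!isRecord(v)) return false;
  if (typeof v.uuid !== 'string' || typeof v.prompt !== 'string') return false;
  if (v.context !== undefined) {
    if (!Array.isArray(v.context) || !v.context.every(isContextItem)) return false;
  }
  if (v.responseTool !== undefined) {
    const tool = v.responseTool;
    if (!isRecord(tool) || typeof tool.name !== 'string' || !isRecord(tool.inputSchema)) return false;
  }
  return true;
}
```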

Your agent server should stream events back to the frontend via the WebSocket connection. See Query Events Reference for the event types your server should emit.

Complete Example

See the Checkers Demo for a complete agent server implementation including query handling, tool routing, and LLM integration.

2. Configure Agent URL

Add agentUrl to your MCPWeb config:

typescript
export const MCP_WEB_CONFIG = {
  name: 'My App',
  description: 'AI-controllable app',
  host: 'localhost',
  wsPort: 3001,
  mcpPort: 3002,
  agentUrl: 'http://localhost:3003',  // Required for queries
};

Custom Query Endpoint (Optional)

By default, when the agentUrl does not contain any path, queries are routed to /query. You can customize the routing by including a custom path in the agentUrl. In either case, a query UUID will be appended to the URL:

typescript
export const MCP_WEB_CONFIG = {
  name: 'My App',
  description: 'AI-controllable app',
  agentUrl: 'http://localhost:8000/api/v1/query',
};

Custom routing is useful when integrating with existing APIs that use different routing conventions.

Examples:

  • Default: agentUrl: 'http://localhost:3000' → queries sent to http://localhost:3000/query/{uuid}

  • Custom: agentUrl: 'http://localhost:8000/api/v1/query' → queries sent to http://localhost:8000/api/v1/query/{uuid}

The agent server will receive:

  • PUT {agentUrl}/{uuid} for new queries
  • DELETE {agentUrl}/{uuid} for canceling queries
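
The routing rule above can be sketched as a small function. This is an illustration of the described behavior, not MCP-Web's actual code: `resolveQueryUrl` is a hypothetical name, and the sketch assumes that an `agentUrl` ending in a bare `/` counts as having no path.

```typescript
// Hypothetical helper: builds the URL a query with the given UUID is sent to.
function resolveQueryUrl(agentUrl: string, uuid: string): string {
  const url = new URL(agentUrl);

  // A bare origin parses to pathname "/", so fall back to the default "/query" route.
  // Otherwise keep the custom path, minus any trailing slashes.
  const basePath = url.pathname === '/' ? '/query' : url.pathname.replace(/\/+$/, '');

  url.pathname = `${basePath}/${encodeURIComponent(uuid)}`;
  return url.toString();
}

resolveQueryUrl('http://localhost:3000', 'abc');
// → 'http://localhost:3000/query/abc'
resolveQueryUrl('http://localhost:8000/api/v1/query', 'abc');
// → 'http://localhost:8000/api/v1/query/abc'
```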

3. Call mcp.query()

Last but not least, issue a query using mcp.query() in your frontend app:

typescript
const query = mcp.query({
  prompt: 'Your instruction to the AI',
  context: [/* optional context */],
  // responseTool: someTool, // optional: a tool the AI must call
  timeout: 300000, // optional, in milliseconds (default: 5 minutes)
});

Simple Example

In the simplest case, send a prompt and await the query result. 🎉

typescript
async function summarizeTodos() {
  const query = mcp.query({
    prompt: "Analyze my today's todos and summarize them"
  });

  try {
    return await query.result;
  } catch (error) {
    console.error('Query failed:', error);
  }
}

Add Context to Query

For many frontend-triggered queries, you know in advance which tools provide essential context. To reduce latency, you can pass that context directly to the AI agent using the context prop.

For instance, to summarize today's todos in the example above, the AI agent needs to know what those todos are. You could let the AI agent discover and call a tool to retrieve them, but that adds latency. It is faster to pass the today's-todos tool as context.

typescript
const getTodaysTodosTool = mcp.addTool({
  name: 'get_todays_todos',
  handler: () => getTodaysTodos(),
  outputSchema: TodosSchema
});

async function summarizeTodos() {
  const query = mcp.query({
    prompt: "Analyze my today's todos and summarize them",
    context: [getTodaysTodosTool],
  });

  try {
    return await query.result;
  } catch (error) {
    console.error('Query failed:', error);
  }
}

There are two types of context you can pass along with a query:

  1. Getter tool definition
  2. Ephemeral information

For getter tool context, the tool's current value is pre-computed before the query is issued, so the AI agent has the value immediately and does not need to call the tool itself.

Ephemeral information can be anything that helps complete the query. Unlike tool context, it does not correspond to a registered tool.

TIP

Since tool definition context already pre-computes the current tool value, it makes sense to hide those tools from the AI agent, as calling them twice would be wasteful.
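
To make the ephemeral case concrete, here is a sketch of what such a context item could look like on the wire, following the context item fields in the query request format (`name`, `value`, `schema`, `description`, `type`). The `ephemeralContext` helper and the example item are hypothetical, and whether the frontend context prop accepts exactly this shape is an assumption, not something confirmed here.

```typescript
// Assumption: JSON Schema is modeled as a plain object for this sketch.
type JSONSchema = Record<string, unknown>;

interface QueryContextItem {
  name: string;
  value: unknown;
  schema: JSONSchema;
  description?: string;
  type: 'ephemeral' | 'tool';
}

// Hypothetical helper: wraps an arbitrary value as an ephemeral context item.
function ephemeralContext(
  name: string,
  value: unknown,
  schema: JSONSchema,
  description?: string,
): QueryContextItem {
  const item: QueryContextItem = { name, value, schema, type: 'ephemeral' };
  if (description !== undefined) item.description = description;
  return item;
}

// Example: UI state that has no corresponding tool but helps the AI agent.
const selection = ephemeralContext(
  'selected_todo_ids',
  ['a1', 'b2'],
  { type: 'array', items: { type: 'string' } },
  'IDs of the todos currently selected in the UI',
);
```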

Streaming Results

For more involved queries, it can be useful to show intermediate progress updates. You can do this by using query.stream, which returns an async iterator.

Framework-Agnostic Pattern

This example uses placeholder state setters like setProgress() and setSummarizing(). Implement these based on your framework (React useState, Svelte stores, Vue refs, etc.). The key pattern is the for await loop over query.stream.

typescript
async function summarizeTodos() {
  setSummarizing(true);

  const query = mcp.query({
    prompt: "Analyze my today's todos and summarize them",
    context: [getTodaysTodosTool],
  });

  try {
    for await (const event of query.stream) {
      switch (event.type) {
        case 'query_accepted':
          setProgress('AI is getting to work');
          break;

        case 'query_progress':
          setProgress(event.content || 'Processing...');
          break;

        case 'query_complete':
          setProgress('Complete!');
          return event.result;

        case 'query_failure':
          setProgress('AI failed');
          showError(event.error);
          break;
      }
    }
  } finally {
    setSummarizing(false);
  }
}
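
One framework-neutral way to implement the state setters is a pure reducer that turns each event into the next UI state. The sketch below is one possible design under that assumption: `QueryUiState`, `initialQueryUiState`, and `reduceQueryEvent` are hypothetical names, and the event shapes follow the event types documented on this page.

```typescript
type QueryEvent =
  | { type: 'query_accepted'; uuid: string }
  | { type: 'query_progress'; uuid: string; content?: string }
  | { type: 'query_complete'; uuid: string; result: unknown }
  | { type: 'query_failure'; uuid: string; error: string }
  | { type: 'query_cancel'; uuid: string };

// Hypothetical UI state; adapt the fields to your app.
interface QueryUiState {
  busy: boolean;
  progress: string;
  error?: string;
  result?: unknown;
}

const initialQueryUiState: QueryUiState = { busy: false, progress: '' };

// Pure function: given the current state and an event, return the next state.
function reduceQueryEvent(state: QueryUiState, event: QueryEvent): QueryUiState {
  switch (event.type) {
    case 'query_accepted':
      return { ...state, busy: true, progress: 'AI is getting to work' };
    case 'query_progress':
      return { ...state, progress: event.content || 'Processing...' };
    case 'query_complete':
      return { ...state, busy: false, progress: 'Complete!', result: event.result };
    case 'query_failure':
      return { ...state, busy: false, progress: 'AI failed', error: event.error };
    case 'query_cancel':
      return { ...state, busy: false, progress: 'Cancelled' };
  }
}
```

Inside the for await loop, you would assign `state = reduceQueryEvent(state, event)` and hand the result to React state, a Svelte store, or a Vue ref.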

Query Events Reference

The following events are streamed from the agent server during query processing:

| Event Type | Description | Properties |
| --- | --- | --- |
| query_accepted | Query was received and queued for processing | uuid |
| query_progress | Intermediate progress update | uuid, content (optional message) |
| query_complete | Query finished successfully | uuid, result |
| query_failure | Query failed | uuid, error (error message) |
| query_cancel | Query was cancelled | uuid |

Event structure:

typescript
// Accepted
{ type: 'query_accepted', uuid: string }

// Progress (can be sent multiple times)
{ type: 'query_progress', uuid: string, content?: string }

// Complete
{ type: 'query_complete', uuid: string, result: unknown }

// Failure
{ type: 'query_failure', uuid: string, error: string }

// Cancelled
{ type: 'query_cancel', uuid: string }
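
On the agent server, the event shapes above can be captured as a discriminated union with small builder functions, so every emitted event carries the right uuid. This is a sketch under stated assumptions: `createQueryEvents` and `isTerminalEvent` are hypothetical names, how events are actually transmitted is not shown, and treating complete, failure, and cancel as the final event of a query is an assumption drawn from their descriptions.

```typescript
type QueryEvent =
  | { type: 'query_accepted'; uuid: string }
  | { type: 'query_progress'; uuid: string; content?: string }
  | { type: 'query_complete'; uuid: string; result: unknown }
  | { type: 'query_failure'; uuid: string; error: string }
  | { type: 'query_cancel'; uuid: string };

// Hypothetical factory: builds event payloads bound to one query UUID.
function createQueryEvents(uuid: string) {
  return {
    accepted: (): QueryEvent => ({ type: 'query_accepted', uuid }),
    progress: (content?: string): QueryEvent =>
      content === undefined
        ? { type: 'query_progress', uuid }
        : { type: 'query_progress', uuid, content },
    complete: (result: unknown): QueryEvent => ({ type: 'query_complete', uuid, result }),
    // Normalizes thrown values to the string the failure event expects.
    failure: (error: unknown): QueryEvent => ({
      type: 'query_failure',
      uuid,
      error: error instanceof Error ? error.message : String(error),
    }),
    cancel: (): QueryEvent => ({ type: 'query_cancel', uuid }),
  };
}

// Assumption: no further events follow a complete, failure, or cancel event.
function isTerminalEvent(event: QueryEvent): boolean {
  return (
    event.type === 'query_complete' ||
    event.type === 'query_failure' ||
    event.type === 'query_cancel'
  );
}
```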

Enforce a Response Tool

Sometimes you want the result of a query to be a tool call. You can enforce this by specifying the responseTool property. This is useful when the frontend needs structured output, such as a game move, rather than free-form text.

typescript
const makeMoveTool = mcp.addTool({
  name: 'make_move',
  description: 'Make a move on the game board',
  handler: (move) => { applyMove(move) },
  inputSchema: z.object({ from: PositionSchema, to: PositionSchema }),
});

async function makeMoveAI() {
  const query = mcp.query({
    prompt: 'Analyze the board and make your move',
    // AI agent must call this tool to complete the query
    responseTool: makeMoveTool,
  });

  try {
    await query.result;
  } catch (error) {
    console.error('AI failed to make a move:', error);
  }
}
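
On the agent server side, enforcing a response tool comes down to checking whether the AI agent actually called it before the query is reported as complete. The sketch below shows one possible check. It is not taken from the Checkers demo or from MCP-Web: `ToolCall` and `canCompleteQuery` are hypothetical names, and the shape of a tool call depends on your LLM provider.

```typescript
// Hypothetical record of a tool call the LLM made while handling the query.
interface ToolCall {
  name: string;
  arguments: unknown;
}

// Only the name is needed for this check; the request also carries schemas.
interface ResponseTool {
  name: string;
}

// Returns true when the query may be reported as complete.
function canCompleteQuery(calls: ToolCall[], responseTool?: ResponseTool): boolean {
  // Without a response tool, a plain text answer is acceptable.
  if (!responseTool) return true;
  // With a response tool, at least one call must target it.
  return calls.some((call) => call.name === responseTool.name);
}
```

If the check fails, an agent server could prompt the LLM again or emit a query_failure event; which is appropriate depends on your app.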

Cancel a Query

You can cancel a query either by calling the cancel() method of the object returned by mcp.query() or by passing an AbortSignal from an AbortController.

Using the query.cancel() function

typescript
let currentQuery: QueryResponse | null = null;

async function startAnalysis() {
  currentQuery = mcp.query({
    prompt: 'Perform deep analysis of the dataset',
    timeout: 600000, // 10 minutes
  });

  for await (const event of currentQuery.stream) {
    // Process events...
  }
}

function cancelAnalysis() {
  if (currentQuery) {
    currentQuery.cancel();
    currentQuery = null;
  }
}

Using AbortSignal

Using the AbortSignal to cancel a query can be useful for auto-cancelling queries (e.g., if they run too long) or when you want to delegate the cancellation without passing the entire query response object.

typescript
async function runWithTimeout() {
  // Auto-cancel after 30 seconds
  const abortController = new AbortController();
  setTimeout(() => abortController.abort(), 30000);

  const query = mcp.query({
    prompt: 'Quick analysis needed',
  }, abortController.signal);

  try {
    for await (const event of query.stream) {
      if (event.type === 'query_complete') {
        return event.result;
      }
    }
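  } catch (error) {
    // `error` is `unknown` under strict TypeScript, so narrow before reading `.name`
    if (error instanceof Error && error.name === 'AbortError') {
      console.log('Query was cancelled');
    } else {
      throw error; // Do not swallow unrelated failures
    }
  }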
}
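
When a query finishes before the deadline, the pending timer from the example above keeps running until it fires. A small helper can pair the signal with a cleanup function. This is a minimal sketch and `createTimeoutSignal` is a hypothetical name. Modern runtimes also offer the built-in `AbortSignal.timeout(ms)`, which aborts with a `TimeoutError` rather than an `AbortError`.

```typescript
// Hypothetical helper: an AbortSignal that fires after `ms` milliseconds,
// plus a `clear` function to stop the timer once the work is done.
function createTimeoutSignal(ms: number): { signal: AbortSignal; clear: () => void } {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  return {
    signal: controller.signal,
    clear: () => clearTimeout(timer),
  };
}

// Usage sketch (assumes `mcp` from the examples above):
//
//   const { signal, clear } = createTimeoutSignal(30000);
//   try {
//     const query = mcp.query({ prompt: 'Quick analysis needed' }, signal);
//     return await query.result;
//   } finally {
//     clear(); // Stop the timer whether the query succeeded or failed
//   }
```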