The problem

Your AI assistant burns through multiple attempts on simple requests like “let’s discuss this PR.” First it tries to fetch the PR but forgets GitHub’s query syntax. Then it asks permission to retry. It gets the PR but needs another request for review comments—and fails again because it mixed up the API parameters. You’re watching it stumble through what should be one smooth conversation.

The insight

MCP currently follows a “smart host, dumb servers” pattern—one AI learns every API across all its server connections. Instead, flip to “orchestrating host, smart servers.” Put specialist agents inside MCP servers and let your host delegate through natural language.
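Concretely, the flip changes the tool surface each server exposes. A minimal sketch with hypothetical tool names (none of these registrations come from the MCP spec): today the host LLM must learn many fine-grained API signatures per server; under the proposed pattern, each server offers one free-form entry point and keeps the API details behind it.

```python
# Hypothetical tool surfaces, sketching the shift in what the host must learn.

# "Smart host, dumb servers": one tool per API endpoint,
# and the host LLM must recall every signature exactly.
dumb_server_tools = {
    "get_pull_request": ["owner", "repo", "pull_number"],
    "get_pull_request_comments": ["owner", "repo", "pull_number"],
    "create_review": ["owner", "repo", "pull_number", "event", "body"],
    # ...dozens more, one per endpoint
}

# "Orchestrating host, smart servers": a single conversational entry
# point; the specialist agent behind it owns the API details.
smart_server_tools = {
    "github": ["request"],  # free-form natural language
}
```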

(See the discussion that sparked this idea.)

An example

Current approach:

# The host must issue each tool call itself, with exact parameters:
get_pull_request(
    owner="modelcontextprotocol",
    repo="python-sdk",
    pull_number=42
)

# ...then a second, separate call just to see the review comments:
get_pull_request_comments(
    owner="modelcontextprotocol",
    repo="python-sdk",
    pull_number=42
)

With server-side agents:

  • Host: “Let’s discuss PR #42 in the Python SDK”
  • GitHub agent: “Got the PR and 8 review comments. The main discussion is about error handling. Want a summary?”
  • Host: “Yes, focus on the unresolved threads”
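The exchange above can be sketched from the server’s side. This is a self-contained illustration, not the MCP SDK: `github_agent`, the naive regex intent parse, and the stubbed API wrappers are all hypothetical stand-ins (a real agent would parse intent with the host LLM via sampling).

```python
import re

def get_pull_request(owner, repo, pull_number):
    # Stand-in for the real GitHub API wrapper shown above.
    return {"number": pull_number, "title": "Improve error handling"}

def get_pull_request_comments(owner, repo, pull_number):
    # Stand-in: pretend eight review comments came back.
    return [{"body": f"comment {i}", "resolved": i % 2 == 0} for i in range(8)]

def github_agent(request: str) -> str:
    """Server-side specialist: parses intent, chains the API calls
    internally, and replies in natural language instead of raw JSON."""
    match = re.search(r"#(\d+)", request)
    if not match:
        # The agent asks for clarification rather than failing.
        return "Which PR number did you mean?"
    number = int(match.group(1))
    pr = get_pull_request("modelcontextprotocol", "python-sdk", number)
    comments = get_pull_request_comments("modelcontextprotocol", "python-sdk", number)
    return (f"Got PR #{pr['number']} ('{pr['title']}') and "
            f"{len(comments)} review comments. Want a summary?")

print(github_agent("Let's discuss PR #42 in the Python SDK"))
# → Got PR #42 ('Improve error handling') and 8 review comments. Want a summary?
```

The host never sees `owner`, `repo`, or `pull_number`; it sends one sentence and gets one answer.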

Why this matters

MCP’s sampling capability enables real dialogue between server agents and the host LLM. Your AI maintains a simple rolodex of specialists instead of memorizing hundreds of API signatures. Each specialist handles complexity internally—rate limits, parameter validation, clarification requests.
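That sampling flow can be sketched as a callback the host hands to the server: the specialist borrows the host’s LLM mid-task instead of bundling its own model. The `create_message`-style callback here loosely mirrors MCP’s `sampling/createMessage` request, but `host_llm`, `summarize_comments`, and the canned reply are hypothetical stand-ins.

```python
# Sketch of the sampling loop: the server-side agent delegates language
# work back to the host's LLM, so the host keeps control of model choice.

def host_llm(prompt: str) -> str:
    # Stand-in for the host forwarding a sampling request to its LLM.
    if "summarize" in prompt.lower():
        return "Main discussion: error handling in the retry path."
    return "OK."

def summarize_comments(comments, sample):
    """Server-side specialist: builds the prompt from its own API data,
    then uses the host-provided sampling callback for the summary."""
    prompt = "Summarize these review comments:\n" + "\n".join(comments)
    return sample(prompt)

comments = ["Nit: rename retries", "Bug: error swallowed in retry path"]
print(summarize_comments(comments, sample=host_llm))
# → Main discussion: error handling in the retry path.
```

The division of labor is the point: the server owns the API plumbing, the host owns the intelligence, and sampling is the channel between them.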

Instead of perfect API calls, you get natural delegation.