How We Turned Claude Into a Cross-System Support Investigator
Support triage at Mergify meant juggling Datadog, Sentry, PostgreSQL, Linear, and source code. We built a repo with MCP servers and Claude Code that investigates tickets in parallel — cutting triage from 15 minutes to under 5, with 75% first-pass accuracy.
Support in a B2B infrastructure company is distributed debugging. At Mergify, every engineer is involved in support. When a customer reports an issue, we need to answer fast, accurately, and with confidence. But the investigation surface is fragmented:
- Logs in Datadog
- Errors in Sentry
- Production read-only PostgreSQL
- Source code in multiple repositories
- Existing tickets in Linear
- Customer conversation in our support platform
Each system has context. None of them shares it.
The result used to look like this:
- Open ticket
- Identify customer org
- Search logs
- Search Sentry
- Check DB state
- Look at code
- Check if there’s already a Linear issue
- Build a timeline
- Draft a response
That was 10 to 15 minutes of focused work per ticket, sometimes more. We reduced that to 2-5 minutes of background time using Claude Code + MCP.
This is the architecture.
The Idea: Turn Claude Into a Cross-System Investigator
Instead of building another internal dashboard, we built a GitHub repository.
That repository is an investigation surface for Claude Code.
Inside it:
- All company repositories are linked as Git submodules under `src/`
- A static `.mcp.json` declares external systems
- A `mcp/` directory hosts local MCP servers
- Slash commands encapsulate production SQL workflows
- A `CLAUDE.md` file encodes our full support runbook
Engineers run `claude`, then paste a support ticket URL. Claude performs a first-pass investigation, generates a structured timeline, suggests a root-cause hypothesis, and drafts a response. If relevant, it proposes or creates a Linear ticket with a plan.
Architecture
MCP Layer: Unified Tooling via Official Protocol
We use the MCP protocol via @modelcontextprotocol/sdk in TypeScript. There are two types of servers:
1. Remote HTTP MCP (Vendor Hosted)
- Sentry
- Linear
Configured in `.mcp.json`:

```json
{
  "mcpServers": {
    "sentry": {
      "type": "http",
      "url": "https://mcp.sentry.dev/mcp"
    },
    "linear": {
      "type": "http",
      "url": "https://mcp.linear.app/mcp"
    }
  }
}
```
Claude connects directly. OAuth handled by vendor. No local code required.
2. Local stdio MCP Servers (Thin Wrappers)
We wrote two local MCP servers:
- Plain (support platform)
- Datadog
Each is a Node.js TypeScript file executed via tsx. Claude Code spawns them as subprocesses using stdio transport.
They are stateless wrappers around vendor SDKs:
- `@team-plain/typescript-sdk`
- `@datadog/datadog-api-client`
Example tool definition, wired up with the SDK imports and stdio transport:

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const mcp = new McpServer({ name: "datadog", version: "1.0.0" });

mcp.tool(
  "search_logs",
  "Search Datadog logs by query",
  {
    query: z.string(),
    from: z.string().default("now-1h"),
    to: z.string().default("now"),
    limit: z.number().int().min(1).max(1000).default(25),
  },
  async ({ query, from, to, limit }) => {
    // call Datadog API
  }
);

await mcp.connect(new StdioServerTransport());
```
Key characteristics:
- Zod schema → auto-converted to JSON Schema
- Stateless API calls
- No in-memory persistence
- Pagination via cursors returned to caller
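Inside the handler, the work is a thin call into the vendor SDK. A minimal sketch of the Datadog side, assuming the `@datadog/datadog-api-client` v2 Logs API; the `buildLogsRequest` helper is illustrative, not a name from our repo:

```typescript
// Illustrative helper: translate the tool arguments into a Datadog
// LogsListRequest body. Keeping this pure makes defaults and cursor
// handling easy to unit-test without network access.
interface SearchLogsArgs {
  query: string;
  from: string;
  to: string;
  limit: number;
  cursor?: string;
}

function buildLogsRequest(args: SearchLogsArgs) {
  return {
    filter: { query: args.query, from: args.from, to: args.to },
    page: { limit: args.limit, ...(args.cursor ? { cursor: args.cursor } : {}) },
    sort: "-timestamp" as const,
  };
}

// In the tool handler, the request goes straight to the SDK, and the
// response (including the next-page cursor) is returned to the caller:
//
//   const api = new v2.LogsApi(configuration);
//   const res = await api.listLogs({ body: buildLogsRequest(args) });
//   return { content: [{ type: "text", text: JSON.stringify(res) }] };
```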
Auth is environment-driven:
"env": {
"DD_API_KEY": "${DD_API_KEY}",
"DD_APP_KEY": "${DD_APP_KEY}"
}
If env vars are missing, servers exit early. CLAUDE.md instructs operators how to fix it.
No dynamic tool registration. Entire .mcp.json is static and versioned.
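Putting both server types together, the static `.mcp.json` looks roughly like this; the file names under `mcp/` are illustrative:

```json
{
  "mcpServers": {
    "sentry": { "type": "http", "url": "https://mcp.sentry.dev/mcp" },
    "linear": { "type": "http", "url": "https://mcp.linear.app/mcp" },
    "datadog": {
      "type": "stdio",
      "command": "npx",
      "args": ["tsx", "mcp/datadog.ts"],
      "env": {
        "DD_API_KEY": "${DD_API_KEY}",
        "DD_APP_KEY": "${DD_APP_KEY}"
      }
    },
    "plain": {
      "type": "stdio",
      "command": "npx",
      "args": ["tsx", "mcp/plain.ts"],
      "env": { "PLAIN_API_KEY": "${PLAIN_API_KEY}" }
    }
  }
}
```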
Claude Code Integration
We use Claude Code CLI.
The repo follows Claude Code conventions:
- `.mcp.json`
- `.claude/settings.json`
- `.claude/commands/`
- `CLAUDE.md`
When you run claude, it:
- Loads MCP config
- Spawns local servers
- Connects to remote MCP endpoints
- Loads `CLAUDE.md`
- Enforces tool permissions
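Tool permissions live in `.claude/settings.json`. A plausible allowlist for this setup; the exact rules in our repo differ, and the script path is an example:

```json
{
  "permissions": {
    "allow": [
      "Bash(./scripts/query-prod.sh:*)",
      "mcp__sentry",
      "mcp__linear",
      "mcp__datadog",
      "mcp__plain"
    ],
    "deny": [
      "Bash(psql:*)"
    ]
  }
}
```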
CLAUDE.md: The Investigation Brain
This file is more than a system prompt. It is:
- A purpose statement
- A repository map
- A self-healing setup procedure
- A full 5-step support runbook
The investigation section enforces parallelism:
Wave 1 — all in parallel:
- Production database via /pr-event-log
- Linear search for related tickets
- Sentry searches (time-window, customer tag, keyword, resolved issues)
- Datadog log search
Then:
Wave 2:
- Source code inspection
- GitHub status check
Claude is instructed to maximize the number of independent background agents. And this matters: sequential thinking kills triage speed. Parallel exploration compresses time-to-signal.
Production Database: Read-Only but Powerful
Claude can query production. The connection path is: Claude → Bash → query-prod.sh → Cloud SQL Proxy → PostgreSQL
There is no persistent connection. The script:

- Starts ephemeral Cloud SQL proxy
- Runs `psql -c "$QUERY"`
- Tears down the connection
Claude constructs free-form SQL. There is no run_sql() MCP tool. However, we set some guardrails:
| Layer | Mechanism |
| --- | --- |
| DB role | Strict read-only IAM role |
| Auth | Short-lived gcloud tokens |
| Script whitelist | Only `query-prod.sh` allowed |
| Prompt guardrails | `LIMIT` required, no `SELECT *`, time filters |
| Timeouts | 30s proxy timeout + DB query timeout |
The hard boundary is infra-level read-only. Everything else reduces blast radius.
Slash commands like `/pr-event-log` embed templated SQL patterns for common investigations. A single command reconstructs:
- Event timeline
- Check runs
- Queue state
- Speculative merges
Claude merges that with logs and errors to produce a unified timeline.
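A slash command is just a markdown file under `.claude/commands/`. A simplified sketch of what `/pr-event-log` could look like; the real SQL is internal, and the table and column names below are placeholders:

```markdown
---
description: Reconstruct the event timeline for a customer PR
---

Run the production event-log query for the PR in $ARGUMENTS.

Use ./scripts/query-prod.sh with a query of this shape (always keep the
LIMIT and the time filter, never SELECT *):

    SELECT received_at, event_type, payload_summary
    FROM pr_events            -- placeholder table name
    WHERE repository = :repo AND pr_number = :pr
      AND received_at > now() - interval '7 days'
    ORDER BY received_at
    LIMIT 200;

Then summarize: event order, check runs, queue state, speculative merges.
```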
End-to-End Flow
When an engineer pastes a ticket URL, the following happens:
- Claude fetches thread via Plain MCP
- Extracts metadata:
  - Organization
  - Repository
  - PR number
  - Timestamps
- Launches Wave 1 in parallel:
  - SQL timeline
  - Sentry searches
  - Datadog logs
  - Linear related tickets
- Synthesizes findings
- If needed:
  - Inspects engine code under `src/`
  - Checks GitHub outage status
- Generates:
  - Structured investigation notes
  - Hypothesis
  - Suggested customer response
  - Linear issue draft with plan
Measurable Impact
- Triage time: from 10–15 minutes focused → 2–5 minutes mostly background
- First-pass accurate diagnosis: ~75% (based on how often the engineer’s final response matches Claude’s initial hypothesis)
- Team adoption: 100%
- Build time: ~10 hours spread over a week
- ROI: positive within weeks
The biggest gain is not speed. It is cognitive load reduction.
Instead of juggling five tabs, the engineer reviews a synthesized narrative.
The Hard Part: Confident Wrong Leads
Claude’s main failure mode is assuming causality. An example pattern might be:
- Sees error in Sentry
- Sees a log spike
- Correlates with the customer report
- Concludes root cause
Sometimes it is correlation, not causation. To detect such cases, we use:
- Human intuition
- System knowledge
- Cross-check inconsistencies
AI is very good at being convincingly wrong across multiple systems. This is why:
- Write operations are not fully auto-approved
- Linear issue creation may require operator approval
- Engineers validate before responding
The system accelerates investigation. It does not replace judgment.
Why This Worked
Three things made this system work:
1. We Encoded the Runbook
Support expertise used to live in engineers' heads. Now it lives in `CLAUDE.md`:
- Parallel search rules.
- Specific Sentry query shapes.
- Mandatory time-window scans.
- GitHub outage checks.
This standardization alone would have improved quality. Claude just executes it faster.
2. MCP as a Clean Abstraction
MCP gave us:
- Unified tool discovery
- Schema validation
- Clear permission control
- Separation between remote and local systems
MCP meant no custom protocols, no glue microservices, and no persistent backend to manage. Everything lives in a plain GitHub repository.
3. Low Build Cost
Building the entire repository took around 10 hours. Thanks to Claude Code, development ran in parallel with other work. The MCP servers were thin wrappers.
The key insight: you don’t need a big AI platform to get leverage. You need integration depth.
What’s Next: Autonomous Triage
Today triage is interactive: engineer runs Claude → gets investigation.
Our next step would be a background worker that:

- Is triggered on ticket creation
- Automatically generates an investigation note
- Stores:
  - A compressed reasoning trace
  - A structured investigation graph
- Attaches the context to the support platform
When the engineer reads the ticket, the investigation is already there.
This moves from a human-driven AI assistant to an AI-prepared human decision.
However, we are cautious here: wrong-lead risk increases when no human seeds context. The structured investigation graph is key to making the reasoning auditable.
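We have not settled on a schema for that graph. A purely illustrative sketch of what an auditable note could look like; none of these names come from an implemented system:

```typescript
// Purely illustrative: one possible shape for a stored investigation,
// not an implemented schema.
interface Evidence {
  source: "datadog" | "sentry" | "postgres" | "linear" | "plain";
  reference: string; // query, issue ID, or log link
  summary: string;
}

interface InvestigationNote {
  ticketUrl: string;
  hypothesis: string;
  confidence: "low" | "medium" | "high";
  evidence: Evidence[]; // each claim points back to raw data
  timeline: { at: string; event: string }[];
  draftResponse: string;
}

// A note is auditable only when the hypothesis is backed by at least
// one piece of evidence that can be traced back to a source system.
function isAuditable(note: InvestigationNote): boolean {
  return note.evidence.length > 0 && note.evidence.every((e) => e.reference !== "");
}
```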
Conclusion
Support triage is distributed debugging. The hard part is not finding information — it is assembling context from systems that were never designed to share it.
Claude Code and MCP gave us a way to collapse that fragmentation without building new infrastructure. A repository, a few thin wrappers, and an explicit runbook. Total investment: about 10 hours.
What changed is not the speed (though that matters). It is the shift from active investigation to review. Engineers now spend their time validating hypotheses instead of constructing them.
We think the next step — pre-computed investigations attached to tickets before a human opens them — is where this gets genuinely interesting. But even without that, the current setup has already changed how our team thinks about support.