zymtrace MCP Server
zymtrace exposes GPU and profiling data directly to AI agents through the Model Context Protocol (MCP).
This turns your performance profiles into an interactive, AI-driven workspace where you can query flamegraphs, investigate regressions, and get optimization recommendations directly in your IDE or terminal.
MCP is an open standard that enables AI agents to securely connect to external data sources and tools. With zymtrace's MCP server, you can analyze profiles using natural language.
Demo
Here's a demo of Claude Code optimizing a PyTorch thermal simulation application with zymtrace, making it 7.5x faster.
Quick Start
Step 1: Check Prerequisites
Ensure you have:
- zymtrace backend running (version `25.11.0` or higher)
- An MCP-compatible AI assistant (Claude Desktop/Code, Cursor, Cody, etc.)
- Authentication credentials (if auth is enabled)
Step 2: Find Your MCP Endpoint
Your MCP server endpoint is your zymtrace gateway URL followed by `/mcp`:
https://zymtrace.company.com/mcp
Step 3: Generate Authentication (If Required)
For basic auth, encode your credentials:
echo -n "your-username:your-password" | base64
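As a sketch with hypothetical credentials (`alice:s3cret`), the encoding and the resulting header value look like this:

```shell
# Hypothetical credentials -- replace with your own.
# The -n flag matters: a trailing newline would corrupt the encoded value.
CREDS=$(echo -n "alice:s3cret" | base64)
echo "Authorization: Basic $CREDS"
```

The echoed line is the header value that the `--header` flag in the next step expects.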
Step 4: Configure Your AI Assistant
Choose your AI assistant and follow the setup:
Claude Desktop/Code
With authentication:
claude mcp add zymtrace \
--transport http \
https://your-zymtrace-instance.com/mcp \
--header "Authorization: Basic YOUR_BASE64_STRING_HERE"
Without authentication (local development):
claude mcp add zymtrace-local \
--transport http \
http://localhost:8080/mcp
Then verify connectivity with `claude mcp list`.
Cursor IDE
Add to your MCP settings:
{
"mcpServers": {
"zymtrace": {
"type": "http",
"url": "https://your-zymtrace-instance.com/mcp",
"headers": {
"Authorization": "Basic YOUR_BASE64_STRING_HERE"
}
}
}
}
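For a local backend without authentication (mirroring the Claude example above, which assumes `localhost:8080`), the same settings shrink to:

```
{
  "mcpServers": {
    "zymtrace-local": {
      "type": "http",
      "url": "http://localhost:8080/mcp"
    }
  }
}
```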
How to Query Your Data
Prompt Structure
Structure your prompts with these key components for best results:
| Component | Description | Examples |
|---|---|---|
| zymtrace | Always mention zymtrace | "in zymtrace", "using zymtrace" |
| Profile type | Specify CPU or GPU | "CPU flamegraphs", "GPU metrics" |
| Analysis type | What you want to see | flamegraphs, top functions, top entities |
| Entity scope | Where to look | container, namespace, pod, script name |
| Time range | When to analyze | last hour, 24 hours, since yesterday |
Template:
"Analyze [PROFILE_TYPE] [ANALYSIS_TYPE] for [ENTITY] in zymtrace over [TIME_RANGE]"
Query Examples
Optimization and Tuning
"Analyze the thermal-diffusion simulation performance using GPU flamegraphs. What optimizations would you recommend based on the last 6 hours of data?"
"Identify the most expensive CUDA kernels in our vllm inference pipeline running in the vllm namespace and suggest improvements."
Performance Investigation
"What are the top 5 CPU-intensive functions in the mortgage-api container over the last 24 hours?"
"Show me GPU utilization patterns for the ml-training pod since yesterday. What is the least optimal function and how do I fix it?"
"Which functions in the payment-service pod are consuming the most CPU over the last 1 hour?"
Troubleshooting
Connection Issues
MCP Server Connection Failed
- Verify the zymtrace backend is running and that your gateway URL plus `/mcp` is reachable
Token Limit Exceeded
- For Claude Desktop/Code: increase the token limit in conversation settings
- Environment variable: set `MAX_MCP_OUTPUT_TOKENS=500000`
- Query optimization: use filters (entity scope, time range) to reduce response size
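For the environment-variable route, a minimal shell sketch (the `500000` value comes from above; an assistant launched from the same shell inherits it):

```shell
# Raise the cap on MCP tool output before starting your AI assistant.
export MAX_MCP_OUTPUT_TOKENS=500000
echo "$MAX_MCP_OUTPUT_TOKENS"
```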