Agentic Workflows (MCP)
RegressionBot natively supports the Model Context Protocol (MCP). Expose visual testing tools directly to your AI agents (Claude Code, Gemini, Windsurf, Cursor) via our remote HTTP MCP server. This allows your agents to autonomously run regression tests, check statuses, and approve baselines without any local installation.
Agent Examples
These examples show what a real session looks like in an MCP-enabled agentic IDE or CLI (such as Claude Code, Cursor, or Windsurf) with the RegressionBot MCP server connected. The agent handles polling, result interpretation, and image analysis automatically.
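For readers who want to see what "handles polling" amounts to, here is a minimal Python sketch of the loop an agent effectively runs. It is an illustration only: `get_status` is a hypothetical callable standing in for however your MCP client invokes the `get_job_status` tool, and the interval, timeout, and the decision to treat `FAILED` as an error are assumptions rather than documented behaviour. The canned responses mirror the fields shown in the sessions on this page.

```python
import time
from typing import Any, Callable, Dict


def wait_for_job(
    get_status: Callable[[str], Dict[str, Any]],  # hypothetical wrapper around get_job_status
    job_id: str,
    interval_s: float = 5.0,    # assumed polling interval
    timeout_s: float = 600.0,   # assumed upper bound on waiting
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the job reports isComplete, then return the final status payload."""
    waited = 0.0
    while True:
        status = get_status(job_id)
        # Assumption: a FAILED job should stop the loop with an error.
        if status.get("status") == "FAILED":
            raise RuntimeError(f"job {job_id} failed")
        if status.get("isComplete"):
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"job {job_id} still running after {timeout_s}s")
        sleep(interval_s)
        waited += interval_s


def fake_status_feed() -> Callable[[str], Dict[str, Any]]:
    """Stand-in for the MCP tool: replays a fixed sequence of status payloads."""
    responses = iter([
        {"status": "PROCESSING", "isComplete": False},
        {"status": "SUMMARIZING", "isComplete": False},
        {"status": "COMPLETED", "isComplete": True, "executionTime": 48},
    ])
    return lambda job_id: next(responses)


final = wait_for_job(
    fake_status_feed(),
    "6ee84bac-40de-44cc-8dca-4760ab108153",
    sleep=lambda seconds: None,  # skip real sleeping in this demo
)
print(final["status"], final["executionTime"])
```

Keying the loop on `isComplete` rather than on a specific status string keeps it independent of the exact status names.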
Compare a preview URL against production baselines
> run visual regression on https://staging.myapp.com against stored baselines
I'll run a visual regression test using the RegressionBot MCP.
[run_regression_job]
testOrigin: "https://staging.myapp.com"
project: "myapp"
devices: ["Desktop Chrome", "mobile"]
→ {
"message": "Job started",
"jobId": "6ee84bac-40de-44cc-8dca-4760ab108153"
}
[get_job_status] jobId: 6ee84bac-40de-44cc-8dca-4760ab108153
→ {
"jobId": "6ee84bac-40de-44cc-8dca-4760ab108153",
"status": "PROCESSING",
"isComplete": false,
"progress": { "total": 6, "completed": 3, "percent": "50.00" }
}
[get_job_status] jobId: 6ee84bac-40de-44cc-8dca-4760ab108153
→ {
"jobId": "6ee84bac-40de-44cc-8dca-4760ab108153",
"status": "SUMMARIZING",
"isComplete": false,
"progress": { "total": 6, "completed": 6, "percent": "100.00" }
}
[get_job_status] jobId: 6ee84bac-40de-44cc-8dca-4760ab108153
→ {
"status": "COMPLETED",
"isComplete": true,
"progress": { "total": 6, "completed": 6, "percent": "100.00" },
"executionTime": 48
}
[get_job_summary] jobId: 6ee84bac-40de-44cc-8dca-4760ab108153, regressionsOnly: true
→ {
"regressionCount": 2,
"matchCount": 4,
"newBaselineCount": 0,
"results": [
{
"url": "https://staging.myapp.com/pricing",
"variantName": "Desktop Chrome",
"status": "SUCCESS",
"diffPercentage": 3.638671875,
"score": 96.361328125,
"isNewBaseline": false,
"regressionbotSummary": "The pricing table layout changed: the Pro tier
card is now highlighted with a border and the CTA button colour shifted
from grey to blue. No structural breakage detected.",
"diffUrl": "https://s3.amazonaws.com/..."
},
{
"url": "https://staging.myapp.com/pricing",
"variantName": "mobile",
"status": "SUCCESS",
"diffPercentage": 3.638671875,
"score": 96.361328125,
"isNewBaseline": false,
"regressionbotSummary": "Same pricing change on mobile — the card
highlight and button colour update are visible. Font sizes unchanged.",
"diffUrl": "https://s3.amazonaws.com/..."
}
]
}
The job completed in 48 seconds. I found 2 regressions:
• /pricing [Desktop Chrome] — 3.64% diff, score 96
"The Pro tier card is now highlighted with a border and the CTA button
colour shifted from grey to blue."
• /pricing [mobile] — 3.64% diff, score 96
"Same change on mobile — card highlight and button colour visible."
The other 4 pages matched their baselines exactly.
Would you like me to fetch the diff images and review them before approving?
Inspect a diff image and approve
> yes, look at the pricing diff and tell me if the change looks intentional
[fetching diffUrl for /pricing Desktop Chrome via HTTP GET...]
I can see the 3-panel diff — BEFORE | AFTER | ANNOTATED DIFF.
The changes are clearly intentional:
- The Pro plan card now has a visible border and background highlight
- The "Get started" button changed from #6b7280 to #2563eb (grey → blue)
- No layout shifts, broken text, or clipped elements
This looks like a deliberate design update. Safe to approve.
> approve it
[approve_job] jobId: 6ee84bac-40de-44cc-8dca-4760ab108153
→ {
"message": "Job approved",
"approvedCount": 2,
"failedCount": 0
}
Done. The 2 updated screenshots have been promoted to new baselines.
Future runs will compare against the new design.
Run multiple jobs in parallel across devices
> test the PR preview against baselines on desktop, mobile, and tablet
I'll start 3 parallel jobs — one per device — then poll all of them.
[run_regression_job] project: "myapp", devices: ["Desktop Chrome"]
→ { "jobId": "31c66aa4-5bd7-4df1-a8e3-1394d77a33f9" }
[run_regression_job] project: "myapp", devices: ["mobile"]
→ { "jobId": "1a597938-b08e-4417-9802-cbb457870c96" }
[run_regression_job] project: "myapp", devices: ["tablet"]
→ { "jobId": "d5452cb0-6b7a-4186-a0b1-0e23204a31e3" }
[get_job_status] 31c66aa4 → PROCESSING 33%
[get_job_status] 1a597938 → PROCESSING 50%
[get_job_status] d5452cb0 → PROCESSING 17%
...
[get_job_status] 31c66aa4 → COMPLETED ✓
[get_job_status] 1a597938 → COMPLETED ✓
[get_job_status] d5452cb0 → COMPLETED ✓
[get_job_summary] 31c66aa4, regressionsOnly: true → 1 regression
[get_job_summary] 1a597938, regressionsOnly: true → 1 regression
[get_job_summary] d5452cb0, regressionsOnly: true → 0 regressions
Results across all devices:
• /about [Desktop Chrome] — 3.64% diff, score 96
"Hero section padding increased, CTA button widened."
• /about [mobile] — 3.64% diff, score 96
"Same layout change on mobile — consistent with desktop update."
• All tablet pages matched baselines.
Recover a previous job by listing recent runs
> what was the result of the last run for the myapp project?
[list_jobs] project: "myapp", limit: 10
→ {
"jobs": [
{
"jobId": "6ee84bac-40de-44cc-8dca-4760ab108153",
"project": "myapp",
"status": "APPROVED",
"totalUrls": 6,
"completedCount": 6,
"createdAt": "2026-05-19T10:53:26.000Z",
"finishedAt": "2026-05-19T10:54:14.000Z"
},
{
"jobId": "e69062bb-f140-40e7-a3e8-40f84deadcdb",
"project": "myapp",
"status": "COMPLETED",
"totalUrls": 6,
"completedCount": 6,
"createdAt": "2026-05-17T13:22:01.000Z",
"finishedAt": "2026-05-17T13:22:06.000Z"
}
]
}
Most recent: job 6ee84bac — APPROVED, 6 pages tested, finished ~1 hour ago.
The run before that is COMPLETED but not yet approved (2 regressions pending).
Tip: The agent can fetch and visually inspect diff images directly from the pre-signed diffUrl in the results. Each diff is a 3-panel image (BEFORE | AFTER | ANNOTATED DIFF) with changed pixels highlighted in red and bounding boxes drawn around clusters of changes. This is the same image used to generate regressionbotSummary.
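The recovery step in the list_jobs example above can also be done in a script. The Python sketch below picks the most recent job that is still awaiting approval from a `list_jobs`-style payload. The helper names are hypothetical, and it assumes that a `COMPLETED` status means "finished but not yet approved", as the session above suggests. The sample data is copied from that session.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional


def parse_ts(ts: str) -> datetime:
    """Parse a UTC timestamp such as 2026-05-19T10:53:26.000Z."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def duration_seconds(job: Dict[str, Any]) -> float:
    """Wall-clock time between createdAt and finishedAt."""
    return (parse_ts(job["finishedAt"]) - parse_ts(job["createdAt"])).total_seconds()


def latest_pending_approval(jobs: List[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Return the newest job whose status is COMPLETED (assumed: not yet approved)."""
    pending = [job for job in jobs if job["status"] == "COMPLETED"]
    return max(pending, key=lambda job: parse_ts(job["createdAt"]), default=None)


jobs = [
    {
        "jobId": "6ee84bac-40de-44cc-8dca-4760ab108153",
        "project": "myapp",
        "status": "APPROVED",
        "createdAt": "2026-05-19T10:53:26.000Z",
        "finishedAt": "2026-05-19T10:54:14.000Z",
    },
    {
        "jobId": "e69062bb-f140-40e7-a3e8-40f84deadcdb",
        "project": "myapp",
        "status": "COMPLETED",
        "createdAt": "2026-05-17T13:22:01.000Z",
        "finishedAt": "2026-05-17T13:22:06.000Z",
    },
]

job = latest_pending_approval(jobs)
print(job["jobId"], duration_seconds(job))
```

Sorting by `createdAt` instead of relying on list order is a defensive choice; the tool is documented as returning the most recent jobs first.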
Claude Code
Agentic IDEs and CLIs (such as Anthropic's Claude Code, Cursor, Windsurf, and Gemini CLI) natively support Streamable HTTP transports and inline header injection via command line or visual settings.
claude mcp add --transport http regressionbot https://mcp.regressionbot.com/ --header "x-api-key: YOUR_API_KEY"
Cursor
Configure remote servers directly through Cursor's graphical interface to keep your API keys encrypted.
- Open Cursor Settings and navigate to Features > MCP Servers.
- Click + Add New MCP Server.
- Fill out the configuration:
  - Name: regressionbot
  - Type: http
  - URL: https://mcp.regressionbot.com/
- Add a header row with x-api-key as the key and your API key as the value.
- Click Save and verify the green status indicator.
Windsurf
Windsurf handles remote HTTP MCPs using a structured JSON file. Add the following to your ~/.codeium/windsurf/mcp_config.json:
{
"mcpServers": {
"regressionbot": {
"serverUrl": "https://mcp.regressionbot.com/",
"headers": {
"x-api-key": "YOUR_API_KEY"
}
}
}
}
Gemini CLI
Gemini CLI natively supports direct Streamable HTTP transport. You can add it via command line or settings file.
gemini mcp add --transport http regressionbot https://mcp.regressionbot.com/ --header "x-api-key: YOUR_API_KEY"
{
"mcpServers": {
"regressionbot": {
"httpUrl": "https://mcp.regressionbot.com/",
"headers": {
"x-api-key": "YOUR_API_KEY"
}
}
}
}
Available Tools
Once connected, your agent has access to the following tools. The MCP server provides instructions automatically, so your agent understands the workflow without any additional prompting.
| Tool | What it does |
|---|---|
| run_regression_job | Start a visual regression job. Accepts explicit paths, a sitemapUrl to crawl, or glob scans to filter a sitemap. Supports multiple devices, element masks, and an optional baseOrigin for ad-hoc origin-vs-origin comparison. |
| get_job_status | Poll progress. Returns status, progress (total / completed / percent), executionTime, and an isComplete flag so your agent knows when to stop polling. Status lifecycle: INITIALIZING → PROCESSING → SUMMARIZING → COMPLETED (or FAILED / APPROVED). |
| get_job_summary | Get full results once the job is complete. Each result includes diffPercentage, score, visualMatchScore, an AI-written regressionbotSummary, and pre-signed S3 URLs for baselineUrl, currentUrl, and diffUrl. Pass regressionsOnly: true to filter out matching pages. |
| list_jobs | List recent jobs ordered by most recent first. Filter by project or set a limit. Useful for recovering a jobId or checking the status of recent runs without leaving the agent session. |
| approve_job | Promote current screenshots to baselines, accepting all visual changes as intentional. This is permanent — existing baselines are overwritten. Only call after reviewing regressions in get_job_summary. |
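Because approve_job is permanent, it can help to gate it behind an explicit review step. The Python sketch below shows one way to do that against a `get_job_summary`-style payload: it formats each regression into a one-line review entry and only reports the job as ready to approve once every regression has been marked as reviewed. The function names and the review-tracking set are hypothetical. The field names (`url`, `variantName`, `diffPercentage`, `score`) come from the tool descriptions above.

```python
from typing import Any, Dict, List, Set, Tuple
from urllib.parse import urlparse

ReviewKey = Tuple[str, str]  # (url, variantName)


def regression_lines(summary: Dict[str, Any]) -> List[str]:
    """Format each result as 'path [variant]: X.XX% diff, score NN'."""
    lines = []
    for result in summary["results"]:
        path = urlparse(result["url"]).path or "/"
        lines.append(
            f'{path} [{result["variantName"]}]: '
            f'{result["diffPercentage"]:.2f}% diff, score {result["score"]:.0f}'
        )
    return lines


def ready_to_approve(summary: Dict[str, Any], reviewed: Set[ReviewKey]) -> bool:
    """True only when every regression in the summary has been reviewed."""
    pending = {(r["url"], r["variantName"]) for r in summary["results"]}
    return pending <= reviewed


# Trimmed payload, as returned with regressionsOnly: true.
summary = {
    "regressionCount": 2,
    "results": [
        {"url": "https://staging.myapp.com/pricing", "variantName": "Desktop Chrome",
         "diffPercentage": 3.638671875, "score": 96.361328125},
        {"url": "https://staging.myapp.com/pricing", "variantName": "mobile",
         "diffPercentage": 3.638671875, "score": 96.361328125},
    ],
}

for line in regression_lines(summary):
    print(line)

reviewed = {("https://staging.myapp.com/pricing", "Desktop Chrome")}
print(ready_to_approve(summary, reviewed))  # mobile variant not reviewed yet
```

Tracking reviews per `(url, variantName)` pair rather than per URL matters because the same page can regress differently on each device.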