
Give your AI real context.

AI is only as good as the context you give it. An agent that cannot see your code, your spec, and the conversations around the work is guessing. Connect the real sources through Model Context Protocol (MCP), and know exactly where the human stays non-negotiable.

An agent working blind is just guessing

You would not ask a new tester to find bugs without showing them the app, the spec, and where the team made its decisions. An AI agent is no different. Most disappointing AI output is not the model being weak. It is the model working blind, asked to reason about a product it has never been allowed to see. Give it what a good teammate would already have, and the quality of what comes back changes completely.

Real access does not move the judgment. It moves the grunt work. The agent reads, analyses, and drafts against your actual system. You keep the risk call, the visual sense, and the decision to ship.

Your real context

GitHub / your repo: frontend and backend code

Confluence and Jira: the spec and the tickets

Slack: team conversations

Past bugs: what has broken before

Google Docs: meeting notes

The running app: driven via Playwright

Context in: the AI agent reads, analyses, and drafts.

Judgment stays: you keep risk, visual sense, exploratory testing, and the call to ship.

What to connect, and what each one gives QA

These are real MCP servers: the sources a good teammate already has. Switch on the ones that fit your stack, starting with your code.

Switch on	What it gives QA	Cost
Filesystem / Git	Reads your actual repo, so it tests what is implemented and not what it assumes. Pair it with a CLAUDE.md or AGENTS.md so it follows your conventions and knows your build and test commands.	free
GitHub	Searches code, reads and opens issues, and manages pull requests. This is the find it, fix it, open a PR loop.	free
Atlassian (Jira, Confluence)	The official Rovo server reads the spec and the acceptance criteria, and files or updates tickets, all with your own permissions. Tests get grounded in intended behaviour, not a guess.	tiered
Slack	Reads the channels and threads where the real decisions and edge cases were worked out, the context that never reaches the ticket.	free
Google Drive / Docs	Turns a meeting note into a structured ticket or bug in your own format, filed where it belongs.	tiered
Playwright	Drives a real browser through the accessibility tree rather than screenshots, so the agent exercises the UI the way a user, and a screen reader, would.	free

The repo is the one to connect first. Playwright for the running app comes a close second, because it lets the agent exercise the real UI instead of reasoning about screenshots. The rest you add as the work needs them.

Your first ten minutes

Connecting a server is a small file you commit or a single command. Here is the whole path, and what each step actually looks like on your screen.

1. Give it your code. Drop a .mcp.json file at your repo root. That one file points the agent at your real frontend and backend, and gives it a browser to drive.
2. Add your hosted tools. Connect GitHub, Jira, Confluence, and Slack with one command each. You sign in as yourself, so the agent only ever sees what you can see.
3. Check it connected. Run /mcp and confirm every server says connected. A server that is missing is a source the agent is still guessing about.
4. Prove it has your context. Ask one question that only works if the wiring is real, and tell it to cite its sources. Real file paths and ticket numbers mean you are good to go.

Step 1. Drop this one file at your repo root. It hands the agent your real code and a browser to drive:

.mcp.json

{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "."]
    },
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
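Before you commit the file, a quick shape check can catch a typo that would leave a server silently missing. This is a minimal sketch, assuming only the structure shown above: an mcpServers object whose entries each have a command string and an args array of strings. The helper name findConfigProblems is hypothetical and is not part of any MCP tooling, and hosted servers added later may use a different entry shape that this check does not cover.

```typescript
// Hypothetical helper: checks only the entry shape used in the .mcp.json above.
function findConfigProblems(raw: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return ["file is not valid JSON"];
  }
  if (typeof parsed !== "object" || parsed === null) {
    return ["top level must be an object"];
  }
  const servers = (parsed as Record<string, unknown>)["mcpServers"];
  if (typeof servers !== "object" || servers === null || Array.isArray(servers)) {
    return ['missing "mcpServers" object'];
  }
  const problems: string[] = [];
  for (const [server, entry] of Object.entries(servers as Record<string, unknown>)) {
    if (typeof entry !== "object" || entry === null) {
      problems.push(`${server}: entry must be an object`);
      continue;
    }
    const fields = entry as Record<string, unknown>;
    if (typeof fields["command"] !== "string" || fields["command"] === "") {
      problems.push(`${server}: "command" must be a non-empty string`);
    }
    const args = fields["args"];
    if (!Array.isArray(args) || !args.every((a) => typeof a === "string")) {
      problems.push(`${server}: "args" must be an array of strings`);
    }
  }
  return problems;
}

// Usage: the playwright entry here is deliberately missing its args.
const example =
  '{"mcpServers": {"filesystem": {"command": "npx", "args": ["-y", "@modelcontextprotocol/server-filesystem", "."]}, "playwright": {"command": "npx"}}}';
console.log(findConfigProblems(example));
// [ 'playwright: "args" must be an array of strings' ]
```

An empty result means the file has the expected shape. It does not prove the servers start, which is what the /mcp check is for.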

Starter kit: the file above plus the verify, file-a-bug, and pull-request prompts, in one paste.

# Give your AI real context for QA: starter kit
# From https://juliapottinger.com/resources/give-ai-real-context/

## 1. Drop this .mcp.json at your repo root

{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "."]
    },
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}

## 2. Verify the wiring is real

Show me what you can see for this project. List the repo, the issue tracker, the chat workspace, and the running app, and for each one name one specific real thing: a file path, an open ticket, a channel. If you cannot reach one, tell me.

## 3. Turn a conversation into a filed bug

Read the cart-bug discussion in the #support channel, then turn it into a single filed bug in our issue tracker. Pull the reproduction details, the affected browsers, and the free-plan account type out of the conversation, and confirm the likely cause by checking the cart retention and cleanup code in the repo, especially anything that changed in the 1.42 release. Write clear numbered steps, a specific expected versus actual, and list the concrete evidence to attach, including the failing cart request and the relevant server log line. Set severity to High and assign it to the cart team.

## 4. Take the same bug to a pull request

The saved-cart bug is confirmed: the cleanup job is keying off the free-plan retention timestamp instead of the cart timestamp. Find the exact code in the repo, make the smallest fix that keeps saved carts independent of the account retention window, and add a test that fails on the old behaviour and passes on the new one. Open a pull request with a clear description and the test output. Change nothing else.

Steps 2 and 3. Add each hosted server with one command, then run /mcp to watch them connect. Check each provider's MCP docs for the current URL:

your-project · claude

$ claude mcp add --transport http github <the GitHub MCP url>

✓ Added github. A browser opened for you to sign in.

> /mcp

● filesystem    connected    reads your repo

● playwright    connected    drives the browser

● github        connected    code, issues, PRs

● atlassian     connected    Jira and Confluence

● slack         connected    channels and threads

Every server you care about should say connected. A missing one is a blind spot: a source the agent will guess about instead of read.
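The blind-spot check can be written down as a tiny comparison between the servers you expect and the ones that report connected. This is a sketch under stated assumptions: the status labels and the function name blindSpots are hypothetical, and it does not parse the real /mcp screen, whose wording may differ. You would fill in the reported statuses by hand or from your own tooling.

```typescript
// Hypothetical status labels; the real /mcp screen may word them differently.
type ServerStatus = "connected" | "failed" | "needs sign-in";

// Returns every expected server that is absent or not connected.
function blindSpots(
  expected: string[],
  reported: Record<string, ServerStatus | undefined>,
): string[] {
  return expected.filter((server) => reported[server] !== "connected");
}

// Usage: atlassian was never added, and slack is waiting for sign-in.
const wanted = ["filesystem", "playwright", "github", "atlassian", "slack"];
const seen: Record<string, ServerStatus | undefined> = {
  filesystem: "connected",
  playwright: "connected",
  github: "connected",
  slack: "needs sign-in",
};
console.log(blindSpots(wanted, seen)); // [ 'atlassian', 'slack' ]
```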

Step 4. Prove it. Ask one thing that only works if the wiring is real, and tell it to cite sources:

Use the verify prompt from the starter kit above. It should reply with real file paths, tickets, and channels.

If it answers with specifics, you are connected. If it stays vague, something is not wired up yet. Pair all of this with a CLAUDE.md at the repo root for your conventions, and the agent has both your standards and your live systems.
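If you want a rough, automatable version of "specific versus vague", a few pattern checks go a long way. This is a heuristic sketch, not a guarantee. The patterns are assumptions about what a concrete reference looks like (a path with a file extension, a Jira-style key, a #channel name), and the ticket key CART-123 in the usage is made up for illustration. A reply can match all three and still be wrong, so you still open one of the cited files yourself.

```typescript
// Hypothetical heuristic: collects the concrete references found in a reply.
interface Evidence {
  filePaths: string[];
  ticketKeys: string[];
  channels: string[];
}

function extractEvidence(reply: string): Evidence {
  const unique = (matches: RegExpMatchArray | null): string[] =>
    Array.from(new Set(matches ?? []));
  return {
    // Paths with at least one slash and a file extension, e.g. cart/cleanup-job.ts
    filePaths: unique(reply.match(/\b[\w.-]+(?:\/[\w.-]+)+\.[A-Za-z0-9]+\b/g)),
    // Jira-style keys, e.g. CART-123
    ticketKeys: unique(reply.match(/\b[A-Z][A-Z0-9]+-\d+\b/g)),
    // Channel names, e.g. #support
    channels: unique(reply.match(/#[a-z][a-z0-9_-]*/g)),
  };
}

// True only when the reply names at least one file, one ticket, and one channel.
function looksGrounded(reply: string): boolean {
  const found = extractEvidence(reply);
  return (
    found.filePaths.length > 0 &&
    found.ticketKeys.length > 0 &&
    found.channels.length > 0
  );
}

// Usage
const groundedReply =
  "I can see cart/cleanup-job.ts in the repo, the open ticket CART-123, and the #support channel.";
const vagueReply = "I have access to your repository and your tools.";
console.log(looksGrounded(groundedReply)); // true
console.log(looksGrounded(vagueReply)); // false
```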

The workflows it unlocks

Here is what this looks like once it is really set up, and it is the part I did not expect to love. With my repo open in the editor, the front end and the back end both in view, plus Slack, Confluence, and the tracker connected, the agent can reach everything I can. So the tedious automation work mostly disappears. I do not go hunting for a locator; it reads the frontend and finds a stable one. It finds the API calls. It finds bugs. It goes past what I asked and does the deep research in seconds, and because it is reading my actual application and everything I have built, what it gives back is connected and specific to my system, not a generic answer. That is the whole reason to give it real access.

Conversation to ticket. A Slack thread or meeting note becomes a structured bug, filed where it belongs, with reproduction, expected vs actual, evidence, and severity.

Real bug hunting. With the code in context, ask it to deep-dive a flow and find where it breaks, not confirm the happy path.

Find it, fix it, PR it. When you find a bug, the agent that understands the system helps make a small fix and open a pull request for review.

Ask it to explain. Less technical? Have it explain the system or a failing test, then verify against the running app.

Offload analysis, keep judgment. Let it summarise, find patterns, and draft. Keep risk, evidence, and the call to ship for yourself.

Within reason: scope access to the task, prefer read access and test data, and keep a human reviewing every change. Real access speeds the work up. It does not move the release decision.

One workflow, worked through: conversation to a filed bug

Here is the most common one in practice. A bug surfaces in a chat channel, gets half-reproduced and guessed at, then lost because everyone is busy. Give an agent access to that channel, the repo, and the issue tracker, and one instruction turns the mess into a filed bug you can act on:

Use the file-a-bug prompt from the starter kit above. It files one clean bug from the whole thread.

Before: the thread

Julia (Support): Hey, getting a few tickets about saved carts going empty on Nyam Box. People add stuff, leave, come back later and it is wiped. Anyone else seen this?

Julia P. (Engineering): Hmm not off the top of my head. Works for me when I test it. Are they signed in or checking out as guest?

Julia (Support): At least two of them are on the free plan I think. One said she came back the next morning and it was gone.

J. Pottinger (Product): I could kind of reproduce it earlier actually. Added three items, closed the tab, opened it again a while later and the cart said 0. This was on Firefox if that matters.

Julia P. (Engineering): Interesting, I was on Chrome. Maybe the cart cookie is expiring early for free accounts? We changed the session length stuff last sprint.

Jules (Engineering): Oh yeah that timing lines up with the 1.42 rollout. We dropped the retention window for free users but the cart was supposed to be separate from that. Could be related.

J. Pottinger (Product): This is going to annoy people honestly. Free users are exactly the ones we want sticking around long enough to upgrade.

Julia (Support): Can someone file this properly? I keep losing track of it in here and I have to get back to the queue.

Julia P. (Engineering): Yeah agreed it should be a ticket. I am heads down on the export thing till Thursday though.

After: the filed bug

Saved cart is emptied for free-plan users after the session expires

Severity: High. Nyam Box web app, cart and checkout area. Reproduced on Firefox 126 and Chrome 125, signed-in free-plan accounts. Build 1.42, which shortened the retention window for free users.

Steps to reproduce

1. Sign in with a free-plan account.
2. Add three or more items to the cart.
3. Close the browser tab and leave the account idle for longer than the free-plan retention window, which is now 12 hours in build 1.42.
4. Open the app again and sign back in with the same account.
5. Open the cart.

Expected

The cart still holds the three items that were added. Saved carts are meant to persist independently of the account retention window, so changing that window should not clear them.

Actual

The cart shows zero items. Everything the user added is gone, with no message explaining why.

Evidence

Attach a screen recording of the empty cart after sign-in, plus the network capture (HAR file) of the cart load request. The response from GET /api/v1/cart returns an empty items array with a 200 status. Also include the server log line where the cart cleanup job runs against the free-plan retention timestamp instead of the cart timestamp.

Impact

Free-plan users only, both Firefox and Chrome. Started with build 1.42. Likely affecting a meaningful share of free accounts, since the retention window is short enough that normal overnight gaps trigger it. These are the users we most want to keep through the upgrade decision, so the impact on conversion matters as much as the broken behavior.

The agent did not invent any of this. It read the conversation, confirmed the cause against the code, and wrote it up in your format. You review the filed bug, not the messy thread. The structure it follows is the one in the bug report template.
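If you want the agent's draft checked before it is filed, the structure above can be written down as a type with a small completeness check. This is a sketch only. The field names, the three severity levels, and the helper names missingSections and renderBug are hypothetical, so map them onto whatever your tracker and your bug report template actually use.

```typescript
// Hypothetical field names; map them onto your tracker's own schema.
type Severity = "Low" | "Medium" | "High";

interface BugReport {
  title: string;
  severity: Severity;
  environment: string;
  steps: string[];
  expected: string;
  actual: string;
  evidence: string[];
  impact: string;
}

// Lists the sections a draft still lacks, so an incomplete report is caught before filing.
function missingSections(bug: BugReport): string[] {
  const missing: string[] = [];
  const textFields: Array<[string, string]> = [
    ["title", bug.title],
    ["environment", bug.environment],
    ["expected", bug.expected],
    ["actual", bug.actual],
    ["impact", bug.impact],
  ];
  for (const [field, value] of textFields) {
    if (value.trim() === "") missing.push(field);
  }
  if (bug.steps.length === 0) missing.push("steps");
  if (bug.evidence.length === 0) missing.push("evidence");
  return missing;
}

// Renders the report as plain text with numbered reproduction steps.
function renderBug(bug: BugReport): string {
  const steps = bug.steps.map((step, i) => `${i + 1}. ${step}`).join("\n");
  const evidence = bug.evidence.map((item) => `- ${item}`).join("\n");
  return [
    bug.title,
    `Severity: ${bug.severity}`,
    `Environment: ${bug.environment}`,
    `Steps to reproduce\n${steps}`,
    `Expected\n${bug.expected}`,
    `Actual\n${bug.actual}`,
    `Evidence\n${evidence}`,
    `Impact\n${bug.impact}`,
  ].join("\n\n");
}

// Usage: a draft with no evidence attached yet.
const draft: BugReport = {
  title: "Saved cart is emptied for free-plan users after the session expires",
  severity: "High",
  environment: "Nyam Box web app, build 1.42, free-plan accounts",
  steps: ["Sign in with a free-plan account.", "Add three or more items to the cart."],
  expected: "The cart still holds the items that were added.",
  actual: "The cart shows zero items.",
  evidence: [],
  impact: "Free-plan users only.",
};
console.log(missingSections(draft)); // [ 'evidence' ]
```

A draft that comes back with missing sections goes back to the agent with a request for that evidence, not into the tracker.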

The same bug, taken to a pull request

Filing it is half the loop. Because the agent already has the repo, you can hand the confirmed bug straight back and ask it to fix it, prove the fix, and open a pull request you review. One instruction:

Use the pull-request prompt from the starter kit above. It returns a PR with a test that proves the fix.

fix/saved-cart-retention

Keep saved carts independent of the free-plan retention window

What changed

cart/cleanup-job.ts: compare each cart against its own updatedAt, not the account retention timestamp.
cart/cleanup-job.test.ts: new test that a free-plan saved cart survives past the account retention window.

The test it added

cart cleanup > keeps a free-plan saved cart past the account retention window: passing. The same test fails on the previous code, so it locks the bug out.

Left to you

Review the diff, confirm this is the retention rule the product actually wants, and decide whether it ships as a hotfix or in the next release.

This is the part to hold onto. The agent did the finding, the fixing, and the proving, and it brought evidence: a test that fails on the old code and passes on the new one. The merge decision is still yours. That is what real access looks like in practice. It speeds the work up and leaves the judgment where it belongs.
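To make "fails on the old code, passes on the new" concrete, here is a minimal sketch of the two rules side by side. It is not the real cart/cleanup-job.ts. The types, the function names, and the 30-day cart window are assumptions made for illustration; only the 12-hour free-plan window comes from the bug report above.

```typescript
const HOUR_MS = 60 * 60 * 1000;
const FREE_PLAN_RETENTION_MS = 12 * HOUR_MS; // from the filed bug above
const CART_RETENTION_MS = 30 * 24 * HOUR_MS; // assumption: the real cart window is not stated

// Hypothetical shapes; the real models will differ.
interface Account {
  plan: "free" | "paid";
  lastActiveAt: number; // ms since epoch
}

interface SavedCart {
  items: string[];
  updatedAt: number; // ms since epoch
}

// Old rule (the bug): the cart is judged by the account's retention timestamp.
function shouldDeleteCartOld(account: Account, _cart: SavedCart, now: number): boolean {
  return account.plan === "free" && now - account.lastActiveAt > FREE_PLAN_RETENTION_MS;
}

// New rule (the fix): the cart is judged by its own updatedAt.
function shouldDeleteCart(cart: SavedCart, now: number): boolean {
  return now - cart.updatedAt > CART_RETENTION_MS;
}

// The regression case: a free-plan user comes back 13 hours later.
const now = Date.UTC(2024, 0, 2, 9, 0, 0);
const account: Account = { plan: "free", lastActiveAt: now - 13 * HOUR_MS };
const cart: SavedCart = { items: ["a", "b", "c"], updatedAt: now - 13 * HOUR_MS };

console.log(shouldDeleteCartOld(account, cart, now)); // true: the cart is wiped
console.log(shouldDeleteCart(cart, now)); // false: the cart survives
```

The same input gives opposite answers under the two rules, which is what lets a single test lock the bug out. Whether 30 days, or any window at all, is the right cart rule is the product question the PR leaves to you.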

Where AI stops and you start

AI is not one thing. Hand it what it is good at, and keep what needs a human, especially anything visual. I am building a game right now, and this is exactly where it bites: the agent cannot see when something is visually off, a button too big, an element a few pixels out of line, a screen that falls apart on a real device. Across real screen sizes and on mobile, that judgment is still my eyes, not the model's.

Hand it to AI

Contracts and integrations between services
Combing through large amounts of data
Spotting when the backend is not returning what it should
Generating web test scenarios from lots of training data
Summarising logs and clustering failures

Keep it human

Seeing that a UI is off-centre or a button is too big
Visual and game testing across real devices and screen sizes
Unified end-to-end user-flow testing
Exploratory testing and finding the real gaps
The business decisions and the call to ship

Keep it safe

Real access is powerful, so scope it the way you would for any new teammate. Three rules cover almost everything.

Scope access to the task. An agent does not need blanket write access to production to read a flow and reason about it.
Prefer read access and test data. Be deliberate about credentials and customer records, the same way you would for a new hire.
A human reviews every change before it merges or ships. Real access speeds the work up. It does not move the release decision.
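One practical way to scope access is to hand the filesystem server only the directories a task needs, in place of the whole repo. The starter file above passes "." as the directory. As far as I know the server treats each trailing argument as an allowed directory, but check its README before relying on that. The helper name scopedFilesystemServer and the directory names below are hypothetical.

```typescript
// Hypothetical helper; the directory names in the usage are placeholders.
interface StdioServerEntry {
  command: string;
  args: string[];
}

// Paths that would hand over far more than a single task needs.
const TOO_BROAD = new Set(["", "/", "~"]);

// Builds a filesystem server entry limited to the listed directories.
function scopedFilesystemServer(allowedDirs: string[]): StdioServerEntry {
  if (allowedDirs.length === 0) {
    throw new Error("list at least one directory");
  }
  for (const dir of allowedDirs) {
    if (TOO_BROAD.has(dir.trim())) {
      throw new Error(`directory "${dir}" is too broad for a scoped task`);
    }
  }
  return {
    command: "npx",
    args: ["-y", "@modelcontextprotocol/server-filesystem", ...allowedDirs],
  };
}

// Usage: print an entry you could paste into .mcp.json for a cart-only task.
const entry = scopedFilesystemServer(["./cart", "./tests/cart"]);
console.log(JSON.stringify({ mcpServers: { filesystem: entry } }, null, 2));
```

This narrows what the agent can read on disk. It does nothing for hosted tools, where the scope is whatever your own sign-in can see, so the other two rules still apply.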

Two more in this set

AI test automation standards (CLAUDE.md)

AI test plan generator