Browser Harness: Give AI Agents Your Real Browser (Not a Sandbox)
Browser Harness is an open-source CDP tool that lets AI agents control your actual Chrome session with all logins intact. Here's how it works and why it matters.
Most AI agent browser tools give you a clean, empty browser. No cookies. No logins. No session history. Every run starts from zero.
Browser Harness takes the opposite approach. It connects directly to your already-running Chrome with all your authenticated sessions. Your Gmail, GitHub, Twitter, banking -- everything the AI needs is already there.
What Is Browser Harness?
Browser Harness is an open-source Python tool built directly on the Chrome DevTools Protocol (CDP). Created by the browser-use team, it hit 4,100+ GitHub stars within four days of its April 17, 2026 launch.
The core idea is minimal: one websocket to Chrome, no framework layer between the agent and the browser. The agent sends raw CDP commands and gets back real responses.
Unlike the default Playwright or Puppeteer workflow, which launches a fresh browser, Browser Harness does not launch its own. It attaches to your existing Chrome instance via CDP, which means every site where you are already logged in is immediately accessible to the agent.
How It Actually Works
The architecture is straightforward:
Chrome -- CDP WebSocket --> daemon.py --> Unix socket --> run.py (your script)
1. Chrome runs with remote debugging enabled (--remote-debugging-port)
2. The daemon process discovers the CDP websocket URL from the DevToolsActivePort file in your Chrome user data directory
3. Your Python script communicates with the daemon via a Unix socket at /tmp/bu-default.sock
4. Every command is a raw CDP call -- Input.dispatchMouseEvent for clicks, Runtime.evaluate for JavaScript, Page.navigate for URLs
There is no abstraction layer. No page object models. No selector strategies. You send CDP commands, Chrome executes them.
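To make "raw CDP call" concrete, here is a minimal sketch of the JSON envelope that travels over the websocket. The method names are the ones listed above; the CdpMessageBuilder class is a hypothetical illustration and is not taken from the Browser Harness source.

```python
import itertools
import json


class CdpMessageBuilder:
    """Builds JSON-encoded CDP commands with increasing ids (hypothetical helper)."""

    def __init__(self):
        self._ids = itertools.count(1)

    def build(self, method, params=None):
        # A CDP command is a JSON object with an id, a method name, and params.
        # Chrome echoes the id back in the response so replies can be matched.
        message = {"id": next(self._ids), "method": method, "params": params or {}}
        return json.dumps(message)


builder = CdpMessageBuilder()
navigate = builder.build("Page.navigate", {"url": "https://github.com"})
evaluate = builder.build(
    "Runtime.evaluate",
    {"expression": "document.title", "returnByValue": True},
)
print(navigate)
print(evaluate)
```

Each string would be sent as one websocket text frame; the response carries the same id plus a result or error object.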
Installation
git clone https://github.com/browser-use/browser-harness ~/Developer/browser-harness
cd ~/Developer/browser-harness
uv tool install -e .
This installs the browser-harness command globally while keeping it linked to the source checkout. When the agent edits helpers.py mid-session, the next invocation uses the updated code immediately.
Usage Pattern
browser-harness <<'PY'
new_tab("https://github.com")
wait_for_load()
print(page_info())
PY
The tool reads Python from stdin. All helpers are pre-imported. The daemon auto-starts on first use.
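The "Python from stdin with helpers pre-imported" pattern can be sketched in a few lines. This is an illustration of the general technique, not the project's actual code: run_script and the stand-in page_info below are hypothetical, and a real tool would pass sys.stdin.read() as the source.

```python
def run_script(source, helpers):
    """Execute source with the helper functions already bound as globals."""
    namespace = dict(helpers)  # copy so the script cannot mutate the helper table
    exec(compile(source, "<stdin>", "exec"), namespace)
    return namespace


# Stand-in helper so the sketch runs without a browser.
def page_info():
    return {"url": "about:blank", "title": ""}


result = run_script("info = page_info()", {"page_info": page_info})
print(result["info"])
```

Because the script sees the helpers as plain globals, the agent can call them without any import lines, which keeps each heredoc short.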
Key helper functions:
| Function | What it does |
|---|---|
| goto(url) | Navigate current tab |
| new_tab(url) | Open a new tab and switch to it |
| click(x, y) | Coordinate-based click (passes through iframes and shadow DOM) |
| type_text(text) | Insert text at cursor |
| press_key(key) | Send special keys (Enter, Tab, etc.) |
| screenshot() | Capture current viewport as PNG |
| js(expression) | Run JavaScript, return value |
| cdp(method, params) | Raw CDP call for anything not covered |
| page_info() | Get URL, title, viewport dimensions |
| wait_for_load() | Block until page readyState is complete |
Why This Matters for AI Agents
Most AI coding agents (Claude Code, Codex, OpenCode) have no built-in browser access. When they need to interact with a web app, they are stuck.
Browser Harness solves this by becoming the agent's browser interface. The agent can:
- Browse authenticated web apps (Jira, Notion, internal dashboards)
- Fill forms, upload files, click buttons
- Scrape data from pages behind login walls
- Post to social media as the user
- Handle multi-step workflows across different sites
The "self-healing" aspect means if a helper function does not exist for what the agent needs, the agent writes it. The README shows this flow:
agent wants to upload a file
helpers.py has no upload_file()
agent edits helpers.py and writes it
upload_file() now exists
file uploaded
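As an illustration of what an agent-written helper for the flow above might look like, here is a sketch of upload_file built on three standard CDP DOM methods. It is not the helper from the repository. It assumes a cdp(method, params) callable that returns the CDP result as a dict, which the article does not confirm, so the callable is passed in explicitly and exercised with a fake transport.

```python
def upload_file(cdp, selector, path):
    """Attach a local file to the <input type="file"> matched by selector."""
    # Look up the document root, then the input element beneath it.
    root_id = cdp("DOM.getDocument", {})["root"]["nodeId"]
    found = cdp("DOM.querySelector", {"nodeId": root_id, "selector": selector})
    node_id = found.get("nodeId")
    if not node_id:
        # A missing or zero nodeId is treated as "no match".
        raise LookupError(f"no element matches {selector!r}")
    # Setting the files directly avoids the native file picker dialog.
    cdp("DOM.setFileInputFiles", {"files": [path], "nodeId": node_id})


# Fake transport that records calls, so the sketch runs without Chrome.
calls = []


def fake_cdp(method, params):
    calls.append((method, params))
    canned = {
        "DOM.getDocument": {"root": {"nodeId": 1}},
        "DOM.querySelector": {"nodeId": 7},
    }
    return canned.get(method, {})


upload_file(fake_cdp, "input[type=file]", "/tmp/report.pdf")
for method, params in calls:
    print(method, params)
```

Passing the transport in as a parameter is a choice made for this sketch; inside helpers.py the function would presumably call the pre-imported cdp() directly.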
Coordinate-Based Clicks: The Default Strategy
Browser Harness defaults to coordinate-based clicks (Input.dispatchMouseEvent) rather than CSS selectors. This is a deliberate design choice.
Coordinate clicks pass through:
- Cross-origin iframes
- Shadow DOM boundaries
- Canvas elements
- WebGL contexts
The workflow is: screenshot to see the page, identify coordinates, click, screenshot again to verify. For cases where coordinates fail (dynamic layouts, responsive design), js() and DOM queries are available as fallbacks.
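At the protocol level, a single click is a press event followed by a release event at the same coordinates. The sketch below builds that pair of Input.dispatchMouseEvent calls; click_events is a hypothetical name, and how the harness's own click() is implemented is not shown in this article.

```python
def click_events(x, y, button="left"):
    """Return the (method, params) pairs that make up one click at (x, y)."""
    base = {"x": x, "y": y, "button": button, "clickCount": 1}
    return [
        # Chrome treats a press plus a release at one point as a click.
        ("Input.dispatchMouseEvent", {"type": "mousePressed", **base}),
        ("Input.dispatchMouseEvent", {"type": "mouseReleased", **base}),
    ]


for method, params in click_events(640, 360):
    print(method, params)
```

The coordinates are viewport CSS pixels, which is why the screenshot-then-click loop works: what the agent sees in the screenshot maps directly onto where the event lands.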
Domain Skills: Site-Specific Knowledge
Browser Harness ships with a domain-skills/ directory containing site-specific patterns. These capture non-obvious details about how specific websites work: private API endpoints, stable selectors, framework quirks, URL patterns.
Current domain skills include Medium, Reddit, and TikTok upload patterns. The project encourages agents to contribute new domain skills back via PRs when they discover reusable patterns.
Enabling Remote Debugging in Chrome
On first use, Chrome needs remote debugging enabled:
1. Quit Chrome completely
2. Relaunch with: open -a "Google Chrome" --args --remote-debugging-port=9222
3. Navigate to chrome://inspect/#remote-debugging
4. Tick the checkbox to enable
This setting is sticky per profile. Once enabled, Chrome will serve CDP on subsequent launches without needing the flag again.
The daemon discovers the debugging port by reading the DevToolsActivePort file from the Chrome user data directory. On macOS, that is ~/Library/Application Support/Google/Chrome/DevToolsActivePort. Chrome 144 and later do not serve the HTTP /json/version endpoint, so this file-based discovery is the only reliable method.
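File-based discovery is simple because DevToolsActivePort holds just two lines: the port number, then the browser-level websocket path. The sketch below turns that into a websocket URL. The function names and the sample UUID are made up for illustration, and this is not the daemon's actual code.

```python
from pathlib import Path

# macOS location given in the article; other platforms differ.
MACOS_PORT_FILE = (
    Path.home() / "Library/Application Support/Google/Chrome/DevToolsActivePort"
)


def parse_devtools_active_port(text, host="127.0.0.1"):
    """Turn the contents of DevToolsActivePort into a websocket URL."""
    lines = text.strip().splitlines()
    if len(lines) < 2:
        raise ValueError("expected a port line and a websocket path line")
    port = int(lines[0])      # line 1: the debugging port
    path = lines[1].strip()   # line 2: e.g. /devtools/browser/<uuid>
    return f"ws://{host}:{port}{path}"


def read_ws_url(port_file=MACOS_PORT_FILE):
    """Read the file from disk and parse it (raises if Chrome has not written it)."""
    return parse_devtools_active_port(Path(port_file).read_text())


# Made-up sample contents, so the sketch runs without Chrome.
sample = "9222\n/devtools/browser/0f1e2d3c-aaaa-bbbb-cccc-123456789abc\n"
print(parse_devtools_active_port(sample))
```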
Remote Browser Support
For headless servers or parallel sub-agents, Browser Harness supports Browser Use cloud browsers. Each agent gets its own isolated browser via a distinct BU_NAME:
BU_NAME=agent1 browser-harness <<'PY'
new_tab("https://example.com")
print(page_info())
PY
Remote browsers are provisioned via the Browser Use API and automatically stopped on daemon shutdown to prevent billing accumulation.
Real-World Test: Posting to X
We tested Browser Harness by posting a thread to X from a free-tier account (280 character limit per tweet). The process:
1. Navigate to x.com/compose/post via goto()
2. Click the text area using coordinate-based click()
3. Type the tweet using type_text()
4. Find the Post button via Runtime.evaluate + getBoundingClientRect()
5. Click it using CDP Input.dispatchMouseEvent
For threads, we posted the first tweet, then navigated to its detail page and replied sequentially. Each reply follows the same pattern: click reply area, type, click Reply button.
The key challenge was button detection. X uses React with no stable data-testid attributes on the Post button. The reliable method was using document.querySelectorAll("button") and filtering by innerText === "Post" combined with position and disabled state.
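The detection step can be sketched as a JavaScript expression built in Python plus a small function that converts the returned rectangle into click coordinates. Both function names are hypothetical. The sketch trims innerText before comparing, which is slightly looser than the strict equality described above, and it covers only the text and disabled checks, not the position filter.

```python
import json


def find_button_js(label):
    """Return a JS expression yielding the rect of the first enabled button with this text."""
    return (
        "(() => {"
        " const b = [...document.querySelectorAll('button')]"
        f" .find(el => el.innerText.trim() === {json.dumps(label)} && !el.disabled);"
        " if (!b) return null;"
        " const r = b.getBoundingClientRect();"
        " return {x: r.x, y: r.y, width: r.width, height: r.height};"
        " })()"
    )


def rect_center(rect):
    """Return the (x, y) midpoint of a bounding rect, which is where to click."""
    return (rect["x"] + rect["width"] / 2, rect["y"] + rect["height"] / 2)


print(find_button_js("Post"))
print(rect_center({"x": 10, "y": 20, "width": 100, "height": 40}))
```

Assuming js() returns the evaluated object as a dict, the two pieces would combine as rect = js(find_button_js("Post")) followed by click(*rect_center(rect)).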
Gotchas We Hit
Chrome singleton problem: On macOS, if Chrome is already running, launching a new instance with --remote-debugging-port just opens a window in the existing process (which was not started with debugging). You must fully quit Chrome first.
CDP connection refused (403): The first CDP connection triggers a Chrome permission dialog. The user must click "Allow" in Chrome before the daemon can connect.
DevToolsActivePort exists but port not listening: The file is written before the port is actually ready. Poll for up to 30 seconds rather than treating connection refused as a permanent failure.
Mask overlay blocking clicks: The compose dialog on X has a mask overlay (data-testid="mask") that intercepts elementFromPoint(). Use getBoundingClientRect() to find button positions instead.
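For the "DevToolsActivePort exists but port not listening" gotcha, the polling can be sketched with the standard library alone. wait_for_port is a hypothetical name; the 30-second default comes from the advice above, and the retry interval is an arbitrary choice.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Retry a TCP connect until it succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused and connect timeouts are both OSError subclasses.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A caller would run wait_for_port("127.0.0.1", 9222) after reading the port file and only report failure once it returns False.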
Comparison: Browser Harness vs Alternatives
| Feature | Browser Harness | Playwright | Puppeteer | Browser Use |
|---|---|---|---|---|
| Uses existing Chrome | Yes | Not by default (launches own) | Not by default (launches own) | No (cloud only) |
| Authenticated sessions | Yes (your Chrome profile) | No | No | Via cloud profiles |
| Protocol | Raw CDP | CDP (abstracted) | CDP (abstracted) | CDP (managed) |
| Abstraction level | Minimal | Full framework | Full framework | Full framework |
| Self-healing | Agent writes missing helpers | N/A | N/A | N/A |
| Remote browsers | Optional (Browser Use cloud) | No | No | Yes (core feature) |
| Cost | Free (MIT) | Free | Free | Paid cloud tier |
| Stars (Apr 2026) | 4,100+ | 70K+ | 90K+ | 55K+ |
When to Use Browser Harness
Use it when:
- Your AI agent needs to interact with sites where you are already logged in
- You want minimal abstraction between the agent and the browser
- You need the agent to write its own helpers for novel interactions
- You are running Claude Code, Codex, or another agent that lacks built-in browser access
Skip it when:
- You need a clean browser session for testing (use Playwright)
- You are running automated test suites with assertions
- You want a framework with page objects, selectors, and waits built in
Bottom Line
Browser Harness fills a specific gap in the AI agent toolchain: giving agents access to your real, authenticated browser. It does this with minimal code (the core is under 200 lines), no framework overhead, and a self-healing design where the agent extends the tool as needed.
At 4,100+ stars in four days, it clearly resonates with developers building agent workflows. The MIT license and editable install make it easy to extend for specific use cases.
GitHub: browser-use/browser-harness