Skip to main content
AI ToolsApril 21, 20268 min

Browser Harness: Give AI Agents Your Real Browser (Not a Sandbox)

Browser Harness is an open-source CDP tool that lets AI agents control your actual Chrome session with all logins intact. Here's how it works and why it matters.

NeuralStackly
Author
Journal

Browser Harness: Give AI Agents Your Real Browser (Not a Sandbox)

Browser Harness: Give AI Agents Your Real Browser (Not a Sandbox)

Most AI agent browser tools give you a clean, empty browser. No cookies. No logins. No session history. Every run starts from zero.

Browser Harness takes the opposite approach. It connects directly to your already-running Chrome with all your authenticated sessions. Your Gmail, GitHub, Twitter, banking -- everything the AI needs is already there.

What Is Browser Harness?

Browser Harness is an open-source Python tool built directly on the Chrome DevTools Protocol (CDP). Created by the browser-use team, it hit 4,100+ GitHub stars within four days of its April 17, 2026 launch.

The core idea is minimal: one websocket to Chrome, no framework layer between the agent and the browser. The agent sends raw CDP commands and gets back real responses.

Unlike Playwright or Puppeteer, Browser Harness does not launch its own browser. It attaches to your existing Chrome instance via CDP, which means every site where you are already logged in is immediately accessible to the agent.

How It Actually Works

The architecture is straightforward:

Chrome -- CDP WebSocket --> daemon.py --> Unix socket --> run.py (your script)

1. Chrome runs with remote debugging enabled (--remote-debugging-port)

2. The daemon process discovers the CDP websocket URL from DevToolsActivePort in your Chrome profile directory

3. Your Python script communicates with the daemon via a Unix socket at /tmp/bu-default.sock

4. Every command is a raw CDP call -- Input.dispatchMouseEvent for clicks, Runtime.evaluate for JavaScript, Page.navigate for URLs

There is no abstraction layer. No page object models. No selector strategies. You send CDP commands, Chrome executes them.

Installation

git clone https://github.com/browser-use/browser-harness ~/Developer/browser-harness
cd ~/Developer/browser-harness
uv tool install -e .

This installs the browser-harness command globally while keeping it linked to the source checkout. When the agent edits helpers.py mid-session, the next invocation uses the updated code immediately.

Usage Pattern

browser-harness <<'PY'
new_tab("https://github.com")
wait_for_load()
print(page_info())
PY

The tool reads Python from stdin. All helpers are pre-imported. The daemon auto-starts on first use.

Key helper functions:

FunctionWhat it does
goto(url)Navigate current tab
new_tab(url)Open a new tab and switch to it
click(x, y)Coordinate-based click (passes through iframes and shadow DOM)
type_text(text)Insert text at cursor
press_key(key)Send special keys (Enter, Tab, etc.)
screenshot()Capture current viewport as PNG
js(expression)Run JavaScript, return value
cdp(method, params)Raw CDP call for anything not covered
page_info()Get URL, title, viewport dimensions
wait_for_load()Block until page readyState is complete

Why This Matters for AI Agents

Most AI coding agents (Claude Code, Codex, OpenCode) have no built-in browser access. When they need to interact with a web app, they are stuck.

Browser Harness solves this by becoming the agent's browser interface. The agent can:

  • Browse authenticated web apps (Jira, Notion, internal dashboards)
  • Fill forms, upload files, click buttons
  • Scrape data from pages behind login walls
  • Post to social media as the user
  • Handle multi-step workflows across different sites

The "self-healing" aspect means if a helper function does not exist for what the agent needs, the agent writes it. The README shows this flow:

agent wants to upload a file
helpers.py has no upload_file()
agent edits helpers.py and writes it
upload_file() now exists
file uploaded

Coordinate-Based Clicks: The Default Strategy

Browser Harness defaults to coordinate-based clicks (Input.dispatchMouseEvent) rather than CSS selectors. This is a deliberate design choice.

Coordinate clicks pass through:

  • Cross-origin iframes
  • Shadow DOM boundaries
  • Canvas elements
  • WebGL contexts

The workflow is: screenshot to see the page, identify coordinates, click, screenshot again to verify. For cases where coordinates fail (dynamic layouts, responsive design), js() and DOM queries are available as fallbacks.

Domain Skills: Site-Specific Knowledge

Browser Harness ships with a domain-skills/ directory containing site-specific patterns. These capture non-obvious details about how specific websites work: private API endpoints, stable selectors, framework quirks, URL patterns.

Current domain skills include Medium, Reddit, and TikTok upload patterns. The project encourages agents to contribute new domain skills back via PRs when they discover reusable patterns.

Enabling Remote Debugging in Chrome

On first use, Chrome needs remote debugging enabled:

1. Quit Chrome completely

2. Relaunch with: open -a "Google Chrome" --args --remote-debugging-port=9222

3. Navigate to chrome://inspect/#remote-debugging

4. Tick the checkbox to enable

This setting is sticky per profile. Once enabled, Chrome will serve CDP on subsequent launches without needing the flag again.

The daemon discovers the debugging port by reading DevToolsActivePort from the Chrome profile directory. On macOS, that is ~/Library/Application Support/Google/Chrome/DevToolsActivePort. Chrome 144 and later do not serve the HTTP /json/version endpoint, so this file-based discovery is the only reliable method.

Remote Browser Support

For headless servers or parallel sub-agents, Browser Harness supports Browser Use cloud browsers. Each agent gets its own isolated browser via a distinct BU_NAME:

BU_NAME=agent1 browser-harness <<'PY'
new_tab("https://example.com")
print(page_info())
PY

Remote browsers are provisioned via the Browser Use API and automatically stopped on daemon shutdown to prevent billing accumulation.

Real-World Test: Posting to X

We tested Browser Harness by posting a thread to X from a free-tier account (280 character limit per tweet). The process:

1. Navigate to x.com/compose/post via goto()

2. Click the text area using coordinate-based click()

3. Type the tweet using type_text()

4. Find the Post button via Runtime.evaluate + getBoundingClientRect()

5. Click it using CDP Input.dispatchMouseEvent

For threads, we posted the first tweet, then navigated to its detail page and replied sequentially. Each reply follows the same pattern: click reply area, type, click Reply button.

The key challenge was button detection. X uses React with no stable data-testid attributes on the Post button. The reliable method was using document.querySelectorAll("button") and filtering by innerText === "Post" combined with position and disabled state.

Gotchas We Hit

Chrome singleton problem: On macOS, if Chrome is already running, launching a new instance with --remote-debugging-port just opens a window in the existing process (which was not started with debugging). You must fully quit Chrome first.

CDP connection refused (403): The first CDP connection triggers a Chrome permission dialog. The user must click "Allow" in Chrome before the daemon can connect.

DevToolsActivePort exists but port not listening: The file is written before the port is actually ready. Poll for up to 30 seconds rather than treating connection refused as a permanent failure.

Mask overlay blocking clicks: The compose dialog on X has a mask overlay (data-testid="mask") that intercepts elementFromPoint(). Use getBoundingClientRect() to find button positions instead.

Comparison: Browser Harness vs Alternatives

FeatureBrowser HarnessPlaywrightPuppeteerBrowser Use
Uses existing ChromeYesNo (launches own)No (launches own)No (cloud only)
Authenticated sessionsYes (your Chrome profile)NoNoVia cloud profiles
ProtocolRaw CDPCDP (abstracted)CDP (abstracted)CDP (managed)
Abstraction levelMinimalFull frameworkFull frameworkFull framework
Self-healingAgent writes missing helpersN/AN/AN/A
Remote browsersOptional (Browser Use cloud)NoNoYes (core feature)
CostFree (MIT)FreeFreePaid cloud tier
Stars (Apr 2026)4,100+70K+90K+55K+

When to Use Browser Harness

Use it when:

  • Your AI agent needs to interact with sites where you are already logged in
  • You want minimal abstraction between the agent and the browser
  • You need the agent to write its own helpers for novel interactions
  • You are running Claude Code, Codex, or another agent that lacks built-in browser access

Skip it when:

  • You need a clean browser session for testing (use Playwright)
  • You are running automated test suites with assertions
  • You want a framework with page objects, selectors, and waits built in

Bottom Line

Browser Harness fills a specific gap in the AI agent toolchain: giving agents access to your real, authenticated browser. It does this with minimal code (the core is under 200 lines), no framework overhead, and a self-healing design where the agent extends the tool as needed.

At 4,100+ stars in four days, it clearly resonates with developers building agent workflows. The MIT license and editable install make it easy to extend for specific use cases.

GitHub: browser-use/browser-harness

Share this article

N

About NeuralStackly

Expert researcher and writer at NeuralStackly, dedicated to finding the best AI tools to boost productivity and business growth.

View all posts

Related Articles

Continue reading with these related posts