LumaBrowser embeds a W3C WebDriver Level 2 server directly inside the browser. Any Selenium client, in any language, connects the same way it connects to ChromeDriver — same capabilities, same locator strategies, same protocol. Opt in to lumabyte:llmFallback and brittle CSS selectors self-heal via an LLM when the DOM shifts underneath them.
LumaBrowser’s selenium-driver extension is a standards-compliant WebDriver HTTP server on port 9515 — the same port, path structure, and session lifecycle as ChromeDriver. What you don’t have to change:
- ChromeOptions() is accepted as-is. goog:chromeOptions is passed through untouched (not every key is enforced yet; see the matrix below).
- driver.execute_cdp_cmd(...) still works. ChromeDriver’s goog:cdp/execute endpoint is aliased to LumaBrowser’s lumabyte:cdp/execute passthrough.
- Selenium Grid: set the URL prefix to /wd/hub in the extension settings and point your hub at LumaBrowser.
Just change the URL:
# Before
driver = webdriver.Remote("http://127.0.0.1:9515", options=ChromeOptions())
# After
driver = webdriver.Remote("http://127.0.0.1:9515", options=ChromeOptions())
# (same URL: ChromeDriver is no longer running; LumaBrowser is.)
That’s the entire migration for W3C-compatible suites. Everything below is what you gain by opting in.
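To make "same capabilities, same protocol" concrete, here is a minimal sketch of the W3C New Session request body that a client sends as POST /session. The helper name `build_new_session_payload` is hypothetical, and the `lumabyte:llmFallback` shape is taken from the examples later on this page rather than from a published schema.

```python
import json


def build_new_session_payload(chrome_args=None, llm_fallback=False):
    """Build a W3C New Session body (hypothetical helper, stdlib only)."""
    always_match = {
        "browserName": "chrome",
        # Passed through as with ChromeDriver; per this page, not every key is enforced yet.
        "goog:chromeOptions": {"args": list(chrome_args or [])},
    }
    if llm_fallback:
        # Vendor capability as shown in this page's examples.
        always_match["lumabyte:llmFallback"] = {"enabled": True}
    return {"capabilities": {"alwaysMatch": always_match, "firstMatch": [{}]}}


if __name__ == "__main__":
    payload = build_new_session_payload(["--headless=new"], llm_fallback=True)
    print(json.dumps(payload, indent=2))
```

Selenium clients build an equivalent body for you; the sketch only shows that the vendor capability sits next to the standard ones without changing their shape.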
Three concrete wins. Each one is opt-in: pay the cost only where you want the benefit.
Tests break when a marketing team renames a CSS class. LumaBrowser retries the find via an LLM using a natural-language description of what the element is — not how it’s currently styled.
Two opt-in modes:
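- Keep your existing selector and attach a lumabyte:description hint to the find call, so the LLM is consulted only when the selector misses.
- Use the ai-description locator strategy and let the LLM own the resolution end-to-end.
# Breaks the moment the button’s class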
# name changes during a UI refresh.
from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Remote(
"http://127.0.0.1:9515",
options=webdriver.ChromeOptions()
)
button = driver.find_element(
By.CSS_SELECTOR,
"button.btn-primary.signup"
)
button.click()
# LLM fallback re-resolves when CSS misses.
from selenium import webdriver
from selenium.webdriver.common.by import By
opts = webdriver.ChromeOptions()
opts.set_capability(
"lumabyte:llmFallback",
{"enabled": True}
)
driver = webdriver.Remote(
"http://127.0.0.1:9515", options=opts
)
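# Option A: keep CSS, add an AI hint
from selenium.webdriver.remote.command import Command
button = driver.execute(
Command.FIND_ELEMENT,  # the Python client's name for "find element"
{"using": "css selector",
"value": "button.btn-primary.signup",
"lumabyte:description": "the sign-up button"}
)["value"]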
# Option B: skip CSS entirely
button = driver.find_element(
"ai-description", "the sign-up button"
)
button.click()
// Breaks the moment the button’s class
// name changes during a UI refresh.
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Remote;
var driver = new RemoteWebDriver(
new Uri("http://127.0.0.1:9515"),
new ChromeOptions()
);
var button = driver.FindElement(
By.CssSelector("button.btn-primary.signup")
);
button.Click();
// LLM fallback re-resolves when CSS misses.
using System;
using System.Collections.Generic;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Remote;
var opts = new ChromeOptions();
opts.AddAdditionalOption(
"lumabyte:llmFallback",
new Dictionary<string, object> { { "enabled", true } }
);
var driver = new RemoteWebDriver(
new Uri("http://127.0.0.1:9515"), opts
);
// Option A: keep CSS, add an AI hint (vendor param
// rides along on the find-element command).
var found = driver.ExecuteCustomDriverCommand(
DriverCommand.FindElement,
new Dictionary<string, object> {
{ "using", "css selector" },
{ "value", "button.btn-primary.signup" },
{ "lumabyte:description", "the sign-up button" }
}
);
// Option B: skip CSS entirely.
var button = driver.FindElement(
By.Custom("ai-description", "the sign-up button")
);
button.Click();
Note: By.Custom requires registering the ai-description strategy on the driver once at startup (see the .NET docs for CustomFinderFactory).
// Breaks the moment the button's class
// name changes during a UI refresh.
const { Builder, By } = require('selenium-webdriver');
const chrome = require('selenium-webdriver/chrome');
const driver = await new Builder()
.usingServer('http://127.0.0.1:9515')
.forBrowser('chrome')
.setChromeOptions(new chrome.Options())
.build();
const button = await driver.findElement(
By.css('button.btn-primary.signup')
);
await button.click();
// LLM fallback re-resolves when CSS misses.
const { Builder, By } = require('selenium-webdriver');
const chrome = require('selenium-webdriver/chrome');
const opts = new chrome.Options();
opts.set('lumabyte:llmFallback', { enabled: true });
const driver = await new Builder()
.usingServer('http://127.0.0.1:9515')
.forBrowser('chrome')
.setChromeOptions(opts)
.build();
const { Command } = require('selenium-webdriver/lib/command');
// Option A: keep CSS, add an AI hint.
const foundA = await driver.execute(
new Command('findElement')
.setParameter('using', 'css selector')
.setParameter('value', 'button.btn-primary.signup')
.setParameter('lumabyte:description', 'the sign-up button')
);
// Option B: skip CSS, dispatch the ai-description strategy directly.
const foundB = await driver.execute(
new Command('findElement')
.setParameter('using', 'ai-description')
.setParameter('value', 'the sign-up button')
);
// foundB is a raw element reference; wrap or click via the executor.
Note: selenium-webdriver for Node doesn’t expose a first-class hook for custom locator strategies, so the raw findElement command is the honest path here. Python and C# bindings wrap this more ergonomically.
Classic Selenium makes N HTTP calls to read N fields — each one is a round-trip through the WebDriver wire protocol, a DOM query, and a response. LumaBrowser’s vendor endpoint /session/:sessionId/lumabyte/dom/snapshot returns the page URL, title, source, and an optional base64 screenshot in a single call.
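Once a snapshot is in hand, the individual fields can be pulled out of its source locally, with no further round-trips. The sketch below uses only the standard library; the `FieldExtractor` class, the `extract_fields` helper, and the sample snapshot contents are illustrative assumptions, not part of LumaBrowser.

```python
from html.parser import HTMLParser


class FieldExtractor(HTMLParser):
    """Collect the text of the first element matching each wanted tag or class."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted      # {field: ("tag" | "class", value)}
        self.found = {}
        self._field = None        # field currently being captured
        self._tag = None          # tag name that opened the capture
        self._depth = 0           # nesting depth of same-named tags

    def handle_starttag(self, tag, attrs):
        if self._field is not None:
            if tag == self._tag:
                self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        for field, (kind, value) in self.wanted.items():
            if field in self.found:
                continue
            if (kind == "tag" and tag == value) or (kind == "class" and value in classes):
                self._field, self._tag, self._depth = field, tag, 1
                self.found[field] = ""
                break

    def handle_endtag(self, tag):
        if self._field is not None and tag == self._tag:
            self._depth -= 1
            if self._depth == 0:
                self.found[self._field] = self.found[self._field].strip()
                self._field = None

    def handle_data(self, data):
        if self._field is not None:
            self.found[self._field] += data


def extract_fields(source, wanted):
    parser = FieldExtractor(wanted)
    parser.feed(source)
    parser.close()
    return parser.found


# Hypothetical snapshot, shaped like the "value" object described above.
snap = {
    "url": "https://example.shop/products/coffee-grinder",
    "title": "Coffee Grinder",
    "source": (
        "<html><body><h1>Coffee Grinder</h1>"
        "<span class='price'>$49.00</span>"
        "<div class='rating'>4.6 / 5</div>"
        "<p class='stock in'>In stock</p></body></html>"
    ),
}

fields = extract_fields(snap["source"], {
    "title": ("tag", "h1"),
    "price": ("class", "price"),
    "rating": ("class", "rating"),
    "stock": ("class", "stock"),
})
```

A real suite would more likely hand the source to an HTML library it already depends on; the point is only that the four lookups become local string work after a single request.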
# One round-trip per field.
title = driver.find_element(By.CSS_SELECTOR, "h1").text
price = driver.find_element(By.CSS_SELECTOR, ".price").text
rating = driver.find_element(By.CSS_SELECTOR, ".rating").text
stock = driver.find_element(By.CSS_SELECTOR, ".stock").text
# 4 requests, 4 DOM walks, 4 responses.
# One call returns URL, title, source, and
# (optionally) a full-page screenshot.
from selenium.webdriver.remote.command import Command
# Register the vendor command once per driver.
driver.command_executor._commands["lumabyteSnapshot"] = (
"POST", "/session/$sessionId/lumabyte/dom/snapshot"
)
snap = driver.execute(
"lumabyteSnapshot",
{"includeScreenshot": True}
)["value"]
# 1 request. Parse locally.
// One round-trip per field.
var title = driver.FindElement(By.CssSelector("h1")).Text;
var price = driver.FindElement(By.CssSelector(".price")).Text;
var rating = driver.FindElement(By.CssSelector(".rating")).Text;
var stock = driver.FindElement(By.CssSelector(".stock")).Text;
// 4 requests, 4 DOM walks, 4 responses.
// One call returns URL, title, source, and
// (optionally) a full-page screenshot.
using OpenQA.Selenium.Remote;
// ExecuteCustomDriverCommand dispatches a vendor command
// that was registered once at driver construction.
var snap = driver.ExecuteCustomDriverCommand(
"lumabyte:dom/snapshot",
new Dictionary<string, object> {
{ "includeScreenshot", true }
}
);
// 1 request. Parse locally.
Note: ExecuteCustomDriverCommand needs the vendor command registered on CommandInfoRepository first (one-time TryAddCommand at driver construction); a two-line helper wraps that setup in most test bases.
// One round-trip per field.
const title = await driver.findElement(By.css('h1')).getText();
const price = await driver.findElement(By.css('.price')).getText();
const rating = await driver.findElement(By.css('.rating')).getText();
const stock = await driver.findElement(By.css('.stock')).getText();
// 4 requests, 4 DOM walks, 4 responses.
// One call returns URL, title, source, and
// (optionally) a full-page screenshot.
const { Command } = require('selenium-webdriver/lib/command');
// Teach the executor about the vendor verb once.
driver.getExecutor().defineCommand(
'lumabyteSnapshot',
'POST',
'/session/:sessionId/lumabyte/dom/snapshot'
);
const snap = await driver.execute(
new Command('lumabyteSnapshot')
.setParameter('includeScreenshot', true)
);
// 1 request. Parse locally.
Classic Selenium needs a version-matched ChromeDriver binary on every CI agent and every developer laptop. Chrome auto-updates, ChromeDriver doesn’t, and your pipeline breaks at 3am until someone bumps webdriver-manager. LumaBrowser’s WebDriver server runs inside the browser process itself, so there is no binary to download, no version drift, no child process to reap.
# Selenium Manager has to download the
# right ChromeDriver for whatever Chrome
# version is installed today.
from selenium import webdriver
driver = webdriver.Chrome() # spawns chromedriver
# Plus: ChromeDriver lifecycle, process
# cleanup, version-match CI matrix, and
# the occasional "session not created:
# This version of ChromeDriver only
# supports Chrome version N" at 3am.
# LumaBrowser is already running.
# Connect directly.
from selenium import webdriver
driver = webdriver.Remote(
"http://127.0.0.1:9515",
options=webdriver.ChromeOptions()
)
# No binary download.
# No version pinning.
# No child process.
// Selenium Manager has to download the
// right ChromeDriver for whatever Chrome
// version is installed today.
using OpenQA.Selenium.Chrome;
var driver = new ChromeDriver(); // spawns chromedriver
// Plus: ChromeDriver lifecycle, process
// cleanup, version-match CI matrix, and
// the occasional "session not created:
// This version of ChromeDriver only
// supports Chrome version N" at 3am.
// LumaBrowser is already running.
// Connect directly.
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Remote;
var driver = new RemoteWebDriver(
new Uri("http://127.0.0.1:9515"),
new ChromeOptions()
);
// No binary download.
// No version pinning.
// No child process.
// selenium-webdriver + Selenium Manager has
// to download the right ChromeDriver for
// whatever Chrome version is installed today.
const { Builder } = require('selenium-webdriver');
const driver = await new Builder()
.forBrowser('chrome')
.build();
// Plus: ChromeDriver lifecycle, process
// cleanup, version-match CI matrix, and
// the occasional "session not created:
// This version of ChromeDriver only
// supports Chrome version N" at 3am.
// LumaBrowser is already running.
// Connect directly.
const { Builder } = require('selenium-webdriver');
const driver = await new Builder()
.usingServer('http://127.0.0.1:9515')
.forBrowser('chrome')
.build();
// No binary download.
// No version pinning.
// No child process.
What a realistic test looks like when you lean on ai-description, per-find hints, and the DOM snapshot together. Notice how close the code reads to the test spec.
import requests
from selenium import webdriver
# 1. Opt in to LLM fallback for this session.
opts = webdriver.ChromeOptions()
opts.set_capability("lumabyte:llmFallback", {
"enabled": True,
"onFindFail": True, # retry failed finds via LLM
"onClickIntercepted": True, # retry intercepted clicks via LLM
})
driver = webdriver.Remote("http://127.0.0.1:9515", options=opts)
session_id = driver.session_id
try:
    driver.get("https://example.shop/products/coffee-grinder")
    # Describe elements, don't select them.
    driver.find_element("ai-description", "the add-to-cart button").click()
    driver.find_element("ai-description", "the cart icon in the header").click()
    # Pull the whole checkout summary in one call.
    snap = requests.post(
        f"http://127.0.0.1:9515/session/{session_id}/lumabyte/dom/snapshot",
        json={"includeScreenshot": True},
    ).json()["value"]
    assert "Coffee Grinder" in snap["source"]
    assert snap["url"].endswith("/cart")
    # snap["screenshot"] is a base64 PNG; feed it straight into your visual diff.
finally:
    driver.quit()
Twenty-five lines, zero brittle CSS, one round-trip to assert the cart state. Try writing the same suite in classic Selenium and count the selectors you’d need to maintain next quarter.
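Before handing the screenshot to a visual-diff tool, it usually needs decoding from base64 into raw PNG bytes. A minimal sketch, assuming the `screenshot` field is a plain base64 string as described above; `decode_screenshot` is a hypothetical helper name.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the fixed 8-byte header of every PNG file


def decode_screenshot(b64_png: str) -> bytes:
    """Decode a base64 screenshot and check that it really is a PNG."""
    raw = base64.b64decode(b64_png)
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("snapshot screenshot is not a PNG")
    return raw


if __name__ == "__main__":
    # Stand-in payload: a PNG signature followed by filler bytes, not a real image.
    fake = base64.b64encode(PNG_SIGNATURE + b"demo-bytes").decode("ascii")
    png_bytes = decode_screenshot(fake)
    with open("cart.png", "wb") as fh:
        fh.write(png_bytes)
```

Checking the signature up front turns a confusing diff-tool error into an immediate, readable failure if the server ever returns something unexpected.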
What works today against the v1 selenium-driver extension, and where the known gaps are:
| Feature | Status | Notes |
|---|---|---|
| W3C session lifecycle (new session, delete session, status) | Supported | Drop-in. |
| Navigation, cookies, timeouts, window handles | Supported | Drop-in. |
| Find by CSS, XPath, link text, partial link text, tag name | Supported | All W3C strategies. |
| Element click / clear / send keys / get text / get attribute | Supported | Drop-in. |
| Execute sync script (with element args) | Supported | POST /execute/sync. |
| Full-page screenshot | Supported | GET /screenshot. |
| goog:chromeOptions pass-through | Accepted | Accepted on new-session; not every option key is enforced yet. |
| goog:cdp/execute (ChromeDriver CDP) | Supported | Aliased to lumabyte:cdp/execute. |
| ai-description locator strategy | LumaBrowser only | Requires lumabyte:llmFallback.enabled=true. |
| lumabyte:dom/snapshot vendor endpoint | LumaBrowser only | One-shot URL + title + source + optional screenshot. |
| Frame switching (POST /frame) | v1 limitation | Use CDP passthrough for iframe work. |
| Shadow DOM subqueries (/shadow/:id/element) | v1 limitation | Use CDP passthrough. |
| execute_async_script | v1 limitation | Use sync script + polling for now. |
| Element-scoped screenshot crop | Returns full tab | Endpoint exists but currently returns the full-tab image. |
| W3C PUA special-key codepoints (\uE000–\uF8FF) | Literal pass-through | Actions API keymap translation is on the roadmap. |
If a row you depend on says v1 limitation, the CDP passthrough (lumabyte:cdp/execute) covers almost everything at the raw Chrome DevTools Protocol layer while the native WebDriver surface catches up.
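For the execute_async_script row, the "sync script + polling" workaround can be sketched as follows. The `poll_script` helper and the `window.__done` variable are hypothetical names; the only driver call used is the standard synchronous `execute_script`.

```python
import time


def poll_script(driver, script, timeout=10.0, interval=0.25):
    """Re-run a synchronous script until it returns a non-None value."""
    deadline = time.monotonic() + timeout
    while True:
        result = driver.execute_script(script)
        if result is not None:          # JavaScript null/undefined arrive as None
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"script returned no value within {timeout}s")
        time.sleep(interval)


class StubDriver:
    """Stand-in for a WebDriver session so the sketch runs without a browser."""

    def __init__(self, ready_after):
        self.calls = 0
        self.ready_after = ready_after

    def execute_script(self, script):
        self.calls += 1
        return {"items": 1} if self.calls >= self.ready_after else None


if __name__ == "__main__":
    stub = StubDriver(ready_after=3)
    print(poll_script(stub, "return window.__done;", timeout=2, interval=0.01))
```

Against a real session, the first step would be a sync script that starts the asynchronous work and stores its result, for example `driver.execute_script("window.__done = null; fetch('/api/cart').then(r => r.json()).then(d => { window.__done = d; });")`, followed by `poll_script(driver, "return window.__done;")`.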
Two ways, takes under a minute either way:
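- Enable the selenium-driver extension; the WebDriver server then listens on http://127.0.0.1:9515.
- Or start it over the local HTTP API:
# Start the WebDriver server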
curl -X POST http://localhost:3000/api/selenium/start
# Check status
curl http://localhost:3000/api/selenium/status
For the full endpoint reference, capability payload shape, and MCP tool surface, see the Selenium Driver section of the API docs.
Yes for the W3C-compatible subset: sessions, navigation, the five W3C locator strategies, click/clear/send-keys, cookies, timeouts, sync script execution, full-page screenshots, and goog:chromeOptions pass-through all behave as they do against ChromeDriver. The compatibility matrix above calls out exactly what doesn’t yet.
Whichever slot you configure in the selenium-driver extension settings (selenium.fallback.slot). That slot routes through LumaBrowser’s LLM Service, so you can point it at Anthropic, any OpenAI-compatible endpoint, or a local model via LM Studio or Ollama — same LLM configuration as the rest of the browser.
For normal commands, roughly the same — the WebDriver server is in-process with the browser, so there’s no IPC to a separate driver binary. The LLM fallback adds latency only when the primary selector misses and fallback is enabled for that session. Happy-path tests run at native speed.
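The control flow behind that latency claim can be sketched in a few lines. This is a conceptual illustration of "try the selector first, consult the LLM only on a miss", not LumaBrowser's server code; every name in it is hypothetical.

```python
def find_with_fallback(primary, llm_resolve, *, fallback_enabled, description=None):
    """Return (element, path), where path is "primary" or "llm".

    primary:     zero-argument callable returning an element or None
    llm_resolve: callable taking a description, returning an element or None
    """
    element = primary()
    if element is not None:
        return element, "primary"          # happy path: no LLM call at all
    if fallback_enabled and description:
        element = llm_resolve(description)  # the only place extra latency appears
        if element is not None:
            return element, "llm"
    raise LookupError("no such element")


if __name__ == "__main__":
    llm_calls = []

    def fake_llm(description):
        llm_calls.append(description)
        return "<button#signup>"

    # Selector hits: the LLM is never consulted.
    print(find_with_fallback(lambda: "<button.signup>", fake_llm, fallback_enabled=True,
                             description="the sign-up button"), llm_calls)
    # Selector misses: the description is resolved by the (fake) LLM.
    print(find_with_fallback(lambda: None, fake_llm, fallback_enabled=True,
                             description="the sign-up button"), llm_calls)
```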
Yes. Set the URL prefix to /wd/hub in the selenium-driver settings. Grid clients will route to LumaBrowser without further changes.
Both speak CDP, and LumaBrowser exposes the lumabyte:cdp/execute passthrough plus the ChromeDriver-compatible goog:cdp/execute alias. A dedicated CDP bridge is on the roadmap; for now, drive LumaBrowser via Selenium and use the CDP passthrough for the Playwright/Puppeteer-shaped calls you need.
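For clients that lack a built-in CDP helper, the passthrough can be reached with a plain HTTP request. The sketch below builds that request using ChromeDriver's documented path (`/session/{id}/goog/cdp/execute`) and body shape (`cmd` plus `params`); that LumaBrowser answers on the same path is this page's claim about the alias, and `build_cdp_request` and `send` are hypothetical helper names.

```python
import json
import urllib.request


def build_cdp_request(base_url, session_id, cmd, params=None):
    """Build (but do not send) a ChromeDriver-style CDP passthrough request."""
    url = f"{base_url.rstrip('/')}/session/{session_id}/goog/cdp/execute"
    body = json.dumps({"cmd": cmd, "params": params or {}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json; charset=utf-8"},
    )


def send(request, timeout=30):
    """Send the request and unwrap the WebDriver "value" envelope."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["value"]


if __name__ == "__main__":
    # Page.getFrameTree is a standard CDP method, useful for iframe work.
    req = build_cdp_request("http://127.0.0.1:9515", "SESSION_ID", "Page.getFrameTree")
    print(req.get_method(), req.full_url)
    # frame_tree = send(req)  # needs a live session; left commented out
```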
Install LumaBrowser, enable the selenium-driver extension, and change one URL in your test suite. The W3C parts just work. The LLM fallback is there when you want it.