Puppeteer vs Selenium

Two protocols, one browser, the same AI. Pick the one your stack already knows.

LumaBrowser ships two automation drivers side by side: a W3C WebDriver server on 9515 for the Selenium ecosystem, and a Chrome DevTools Protocol server on 9222 for Puppeteer. They run concurrently, expose identical natural-language element resolution, and let you bring the client tools your team already trusts. Choosing a protocol is a question of ergonomics, not capability.

What they share

Before we dive into differences, the things that are not a tradeoff:

  • The same LLM selector fallback. Natural-language element resolution (ai-description in Selenium, Lumabyte.find / Lumabyte.click in Puppeteer) routes through one shared orchestrator: deterministic accessibility match → LLM attempt → validation → background shadow-template regeneration. Improvements to the pipeline ship to both drivers at once.
  • The same LLM slot. Both drivers call the core selector-resolver slot. Configure a fast, low-latency model once and every description-based call from either client benefits.
  • The same DOM-snapshot shortcut. Both expose a single-round-trip page-capture endpoint: POST /lumabyte/dom/snapshot in Selenium, Lumabyte.domSnapshot in Puppeteer. Useful for agent loops that want a full page state without three separate commands.
  • The same MCP control surface. Start, stop, and inspect either driver from Claude or any MCP client using the selenium_driver_* or puppeteer_driver_* tools.
  • The same host process. Both drivers live in-process with the browser — no ChromeDriver binary, no external Chromium launch, no version drift.
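
The shared resolution order described above can be sketched roughly as follows. This is an illustration only, not LumaBrowser's implementation: the names `match_accessibility`, `validate`, `resolve`, and `ask_llm` are hypothetical, the page is modelled as a plain list of dicts, and the background shadow-template regeneration step is left out.

```python
from typing import Callable, Optional

# Hypothetical page model: each element has a CSS selector and an accessible name.
Page = list[dict[str, str]]


def match_accessibility(page: Page, description: str) -> Optional[str]:
    """Deterministic step: case-insensitive match on the accessible name."""
    wanted = description.strip().lower()
    for element in page:
        if element["name"].lower() == wanted:
            return element["selector"]
    return None


def validate(page: Page, selector: Optional[str]) -> bool:
    """Accept a candidate only if it points at an element that exists."""
    return any(element["selector"] == selector for element in page)


def resolve(
    page: Page,
    description: str,
    ask_llm: Callable[[str], Optional[str]],
) -> tuple[Optional[str], str]:
    """Return (selector, strategy); the LLM is consulted only on a miss."""
    selector = match_accessibility(page, description)
    if selector is not None:
        return selector, "deterministic"
    candidate = ask_llm(description)  # stand-in for the shared LLM slot
    if validate(page, candidate):
        return candidate, "llm"
    return None, "failed"
```

The point of the ordering is that the model call is the slow, non-deterministic step, so it only runs when the cheap match has already failed, and its answer is checked before it is trusted.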

Side-by-side

| Dimension | Puppeteer Driver | Selenium Driver |
| --- | --- | --- |
| Wire protocol | Chrome DevTools Protocol (CDP) over WebSocket | W3C WebDriver Level 2 over HTTP |
| Default port | 9222 (Chrome's DevTools convention) | 9515 (ChromeDriver's port) |
| Wire format | JSON-RPC: { id, sessionId?, method, params } | REST: HTTP verb + URL path + JSON body |
| Events | Push. Chromium streams Page.*, Network.*, Console.*, etc. over the same WebSocket. | Poll. No native event stream; clients re-query on an interval. |
| Request interception | Native. The Fetch domain lets you pause, modify, or fulfill any request. | Not part of the W3C spec. Reach for the lumabyte:cdp/execute passthrough when you need it. |
| DOM access model | Persistent nodeIds returned by DOM.*. Resolve once, reference many times. | Opaque element handles keyed by CSS/XPath, often re-queried per interaction. |
| Tab ownership | Dedicated tabs. Automation spawns puppeteer-kind tabs with a purple accent and a PUP badge in the strip. User tabs stay invisible to the CDP client. | Drives any tab. Operates on whatever tab is active in LumaBrowser, including tabs the user is interacting with. |
| Minimum spec surface | ~10 handwritten handlers (Browser, Target, Lumabyte). Everything else forwards to Chromium via webContents.debugger. | ~60 handwritten WebDriver endpoints (session lifecycle, elements, cookies, actions, alerts, screenshots, etc.). |
| Language ecosystem | JavaScript/TypeScript-first (Puppeteer). CDP clients exist for Python (pyppeteer, playwright), Go, Rust. | Every mainstream language: Python, Java, C#, Ruby, JavaScript, Kotlin, Go, Rust, PHP. |
| Natural-language elements | Lumabyte.find({ description }), Lumabyte.click({ description }). | ai-description locator strategy; per-find lumabyte:description hint; /lumabyte/find and /lumabyte/click vendor endpoints. |
| Fallback configuration | Per-CDP-session: Lumabyte.configureFallback({ enabled, onFindFail, onClickIntercepted, slot }). | Per-WebDriver-session capability: lumabyte:llmFallback: { enabled, onFindFail, onClickIntercepted, slot }. |
| Stealth / fingerprinting | Lower signal. navigator.webdriver is not set; Network.setUserAgentOverride and Emulation.* are native. | Higher signal. WebDriver conformance sets navigator.webdriver; bot-detection heuristics key on it. |
| Selenium Grid compatibility | Not applicable: CDP doesn't participate in Grid. | Yes. Set the URL prefix to /wd/hub in settings. |

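
To make the "Wire format" row concrete, here is a small sketch of the same navigation expressed in both shapes. `Page.navigate` and `POST /session/{id}/url` are the standard CDP and W3C WebDriver commands; the helper names and the session id values are placeholders chosen for illustration.

```python
import json


def cdp_navigate(message_id: int, session_id: str, url: str) -> str:
    """CDP: a single JSON message written to the WebSocket."""
    return json.dumps({
        "id": message_id,         # echoed back in the matching response
        "sessionId": session_id,  # which attached target the command is for
        "method": "Page.navigate",
        "params": {"url": url},
    })


def webdriver_navigate(session_id: str, url: str) -> tuple[str, str, str]:
    """WebDriver: HTTP verb, URL path, and JSON body."""
    return "POST", f"/session/{session_id}/url", json.dumps({"url": url})
```

In CDP the command name travels inside the payload, while in WebDriver it is encoded in the verb and path, which is why WebDriver clients can be built on any plain HTTP library.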
Connecting — the shape of a session

Same browser, same LumaBrowser instance, two different handshakes:

Puppeteer: CDP on 9222

// npm install puppeteer-core
const puppeteer = require('puppeteer-core');

// CommonJS has no top-level await, so wrap the
// session in an async function.
(async () => {
  const browser = await puppeteer.connect({
    browserURL: 'http://127.0.0.1:9222',
  });

  // newPage spawns a purple PUP tab inside
  // LumaBrowser; the automation never touches
  // the user's regular tabs.
  const page = await browser.newPage();
  await page.goto('https://example.com');

  const title = await page.title();
  console.log(title);

  await browser.disconnect();
})();

Selenium: WebDriver on 9515

# pip install selenium
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Remote(
    "http://127.0.0.1:9515",
    options=webdriver.ChromeOptions(),
)

# Operates on whichever tab is active in
# LumaBrowser, the same tabs the user sees.
driver.get("https://example.com")

# Read the page's top-level heading.
heading = driver.find_element(By.CSS_SELECTOR, "h1").text
print(heading)

driver.quit()

Both handshakes take about the same wall time. The Selenium session negotiates capabilities over HTTP; the Puppeteer session negotiates via Browser.getVersion, Target.setDiscoverTargets, and Target.setAutoAttach over the WebSocket.
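
As a rough sketch of what each handshake puts on the wire: the three CDP method names come from the paragraph above and their parameter names from the CDP Target domain, but the parameter values shown are illustrative assumptions rather than what any particular Puppeteer version sends. The WebDriver side is the standard New Session request; the helper names are hypothetical.

```python
import itertools
import json


def cdp_handshake() -> list[dict]:
    """The CDP negotiation as an ordered list of JSON-RPC messages."""
    ids = itertools.count(1)
    steps = [
        ("Browser.getVersion", {}),
        ("Target.setDiscoverTargets", {"discover": True}),
        # Values below are assumed for illustration.
        ("Target.setAutoAttach", {
            "autoAttach": True,
            "waitForDebuggerOnStart": False,
            "flatten": True,
        }),
    ]
    return [{"id": next(ids), "method": m, "params": p} for m, p in steps]


def webdriver_new_session(capabilities: dict) -> tuple[str, str, str]:
    """The WebDriver negotiation: one POST carrying the capabilities."""
    body = {"capabilities": {"alwaysMatch": capabilities}}
    return "POST", "/session", json.dumps(body)
```

The CDP side is several small messages on an already-open socket, while the WebDriver side is a single larger request, which is one reason neither handshake dominates the other.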

Natural-language element resolution — side by side

The LumaByte differentiator. Neither vanilla Puppeteer nor vanilla Selenium ships natural-language element resolution with a deterministic-first fallback. Both drivers expose it; the syntax shape is the one thing that differs.

Puppeteer: Lumabyte CDP domain

const cdp = await page.target().createCDPSession();

// Opt in per-session. CDP has no first-class
// capability slot, so it's a method call.
await cdp.send('Lumabyte.configureFallback', {
  enabled: true,
});

// Tries selector first, LLM only on miss.
const r = await cdp.send('Lumabyte.find', {
  description: 'the more information link',
  selector:    'a.cta-deprecated',
});
// { selector: 'a[href*="iana.org"]',
//   strategy: 'description' }

await cdp.send('Lumabyte.click', {
  description: 'the add to cart button',
});

Selenium: capability + vendor strategy

from selenium import webdriver

opts = webdriver.ChromeOptions()

# Opt in at session creation.
opts.set_capability("lumabyte:llmFallback", {
    "enabled": True,
})

driver = webdriver.Remote(
    "http://127.0.0.1:9515", options=opts,
)

# Skip CSS entirely.
link = driver.find_element(
    "ai-description",
    "the more information link",
)

# Description-based finds return ordinary
# WebElements, so the usual actions apply. The
# per-find lumabyte:description hint (not shown)
# instead keeps a CSS locator and runs the LLM
# only if the CSS path misses.
btn = driver.find_element(
    "ai-description",
    "the add to cart button",
)
btn.click()

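
Because both drivers accept the same four fallback fields, one settings dict can feed either opt-in path. The sketch below assumes the field names listed in the comparison table; the helper names are hypothetical, and the value types (booleans for the two triggers, a string for the slot) are assumptions made for illustration.

```python
FALLBACK_KEYS = ("enabled", "onFindFail", "onClickIntercepted", "slot")


def fallback_settings(**overrides) -> dict:
    """Build a settings dict, rejecting keys neither driver documents."""
    unknown = set(overrides) - set(FALLBACK_KEYS)
    if unknown:
        raise ValueError(f"unknown fallback keys: {sorted(unknown)}")
    return {"enabled": True, **overrides}


def as_selenium_capability(settings: dict) -> dict:
    """Shape for a WebDriver session capability."""
    return {"lumabyte:llmFallback": dict(settings)}


def as_cdp_call(settings: dict) -> tuple[str, dict]:
    """Shape for a CDP method call: (method name, params)."""
    return "Lumabyte.configureFallback", dict(settings)
```

With Selenium the capability value would be passed to `opts.set_capability("lumabyte:llmFallback", ...)` before the session is created; with Puppeteer the pair maps onto `cdp.send(method, params)` after attaching.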
One round-trip page capture

Both drivers collapse “DOM + accessibility + screenshot” into a single call. Different surfaces, equivalent payload.

Puppeteer: Lumabyte.domSnapshot

const cdp = await page.target().createCDPSession();

const { snapshot, axTree, screenshot } =
  await cdp.send('Lumabyte.domSnapshot', {
    includeAxTree:     true,
    includeScreenshot: true,
  });

// snapshot:   DOMSnapshot.captureSnapshot
// axTree:     Accessibility.getFullAXTree
// screenshot: base64 PNG of the viewport

Selenium: /lumabyte/dom/snapshot

import requests

snap = requests.post(
    f"http://127.0.0.1:9515/session/"
    f"{driver.session_id}/lumabyte/dom/snapshot",
    json={"includeScreenshot": True},
).json()["value"]

# snap["url"], snap["title"], snap["source"],
# snap["screenshot"] (base64 PNG, optional)

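
Both snapshot calls return the screenshot as a base64 PNG string, so an agent loop usually needs to decode it before saving or passing it to a vision model. A minimal sketch, where `decode_screenshot` is a hypothetical helper name:

```python
import base64

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_screenshot(b64_png: str) -> bytes:
    """Decode a base64 screenshot and check that it really is a PNG."""
    data = base64.b64decode(b64_png)
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("payload is not a PNG image")
    return data
```

With the Selenium response above this would be called as `decode_screenshot(snap["screenshot"])`, after checking that the optional key is present.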
Which one should I pick?

The capabilities are equivalent. The decision is about ergonomics and the ecosystem you already live in.

Puppeteer if…

  • You are building agent tooling: LangChain, browser-use, AgentCP, or a custom LLM orchestrator that already speaks CDP or wants event streams.
  • You need request interception — pausing, modifying, or fulfilling fetches via the Fetch domain — without dropping to CDP passthrough tricks.
  • Your workflow listens for push events: network, console, DOM mutations, target lifecycle.
  • You run stealth-sensitive automation: scraping flows where navigator.webdriver is a tell you can't afford.
  • You want dedicated automation tabs that don't overlap with the user's workflow — the purple PUP tabs stay out of the way.
  • Your team is comfortable in JavaScript/TypeScript (Puppeteer's native home).

Selenium if…

  • You already have a mature test suite in Selenium, WebdriverIO, Capybara, or Nightwatch. Change one URL and inherit LLM fallback.
  • You need cross-language clients: Python, Java, C#, Ruby, Kotlin, Go, PHP, Rust — WebDriver's ecosystem is simply wider.
  • You route through Selenium Grid and want LumaBrowser to slot in unchanged (set urlPrefix=/wd/hub).
  • You want the user's tabs to be the automation target — useful for record/replay, visual QA of a real browsing session, or observability on what a human has open.
  • Your pipeline depends on standard W3C semantics: goog:chromeOptions pass-through, CSS/XPath/link-text locators, synchronous script execution.

Still unsure? Both drivers are off by default. Enable both in LumaBrowser's settings and try the same flow against each — they run concurrently on different ports without interfering with each other.

Enable either (or both)

1. Settings. Open Settings → Selenium or Settings → Puppeteer, check “Start automatically when LumaBrowser launches”, then click Save & Start.

2. REST.

# Start the Selenium WebDriver server (port 9515)
curl -X POST http://localhost:3000/api/selenium/start

# Start the Puppeteer CDP server (port 9222)
curl -X POST http://localhost:3000/api/puppeteer/start

# Check both
curl http://localhost:3000/api/selenium/status
curl http://localhost:3000/api/puppeteer/status
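
The same calls can be prepared from Python with the standard library. This sketch only builds the requests; it assumes the base URL and the four paths shown in the curl commands above, and `driver_request` is a hypothetical helper name. The response format is not described here, so nothing is parsed.

```python
import urllib.request

API_BASE = "http://localhost:3000/api"


def driver_request(driver: str, action: str) -> urllib.request.Request:
    """Build (but do not send) a start or status request for one driver."""
    if driver not in ("selenium", "puppeteer"):
        raise ValueError(f"unknown driver: {driver}")
    if action not in ("start", "status"):
        raise ValueError(f"unknown action: {action}")
    # The curl examples use POST to start and a plain GET for status.
    method = "POST" if action == "start" else "GET"
    return urllib.request.Request(f"{API_BASE}/{driver}/{action}", method=method)
```

Passing the result to `urllib.request.urlopen(...)` sends it, for example `urllib.request.urlopen(driver_request("selenium", "start"))`.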

FAQ

Can I run Puppeteer and Selenium against LumaBrowser at the same time?

Yes. The two extensions bind to different ports (Selenium on 9515, Puppeteer on 9222) and the Puppeteer driver creates its own puppeteer-kind tabs rather than sharing the user's tabs, so neither driver collides with the other. Run a Selenium test suite and a Puppeteer agent against the same LumaBrowser instance in parallel.
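
Before launching both clients, a quick TCP probe can confirm that each listener is actually up. This is a generic socket check, not a LumaBrowser API; the helper names are hypothetical and the default ports are the ones stated above.

```python
import socket
from typing import Optional


def is_listening(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def driver_ports_up(
    host: str = "127.0.0.1",
    ports: Optional[dict[str, int]] = None,
) -> dict[str, bool]:
    """Probe each driver's port and report which ones answer."""
    ports = ports or {"selenium": 9515, "puppeteer": 9222}
    return {name: is_listening(host, port) for name, port in ports.items()}
```

A probe like this only shows that a port is open; it does not prove the listener is LumaBrowser rather than a stray ChromeDriver or Chrome instance using the same conventional port.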

Do both drivers share the LLM fallback?

Yes. Both drivers route their natural-language-to-selector resolution through the same core selector-resolver slot and the same deterministic-match → LLM → validation pipeline. Improvements to the pipeline ship to both drivers.

Which driver is faster?

For single commands, both run in-process with the browser, so neither has noticeable IPC overhead. The CDP driver streams events instead of polling, which makes it faster for workloads that listen for network requests, console logs, or DOM mutations. For happy-path finds and clicks, they are effectively identical.
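
The push model comes down to how a CDP client tells frames apart on the shared WebSocket: replies carry the `id` of the command they answer, while events carry a `method` name and no `id`. The sketch below shows that routing in isolation; `CdpDispatcher` is a hypothetical class, not part of Puppeteer or LumaBrowser.

```python
import json
from collections import defaultdict
from typing import Callable


class CdpDispatcher:
    """Route incoming CDP frames to pending replies or event listeners."""

    def __init__(self) -> None:
        self.responses: dict[int, object] = {}
        self.listeners: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def on(self, method: str, callback: Callable[[dict], None]) -> None:
        """Subscribe to a pushed event such as Network.requestWillBeSent."""
        self.listeners[method].append(callback)

    def feed(self, raw: str) -> str:
        """Handle one frame and report whether it was a response or an event."""
        message = json.loads(raw)
        if "id" in message:
            # Reply to a command this client sent earlier.
            self.responses[message["id"]] = message.get("result", message.get("error"))
            return "response"
        # No id: the browser pushed this without being asked.
        for callback in self.listeners[message["method"]]:
            callback(message.get("params", {}))
        return "event"
```

A WebDriver client has no equivalent of the second branch, so it has to issue a fresh request each time it wants to know whether something changed.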

Is the LLM fallback opt-in for both drivers?

Yes. In Selenium you set the lumabyte:llmFallback capability on the new session. In Puppeteer you call Lumabyte.configureFallback({ enabled: true }) on the CDP session. Either way, the LLM never runs unless you asked for it.

Why does the Puppeteer driver use dedicated tabs?

Chromium's webContents.debugger is an exclusive lock per tab. If Puppeteer attached to the user's regular tabs, LumaBrowser's Network Interceptor, AI Chat browsing tools, and other subsystems would fight for the same handle. Dedicated puppeteer-kind tabs (rendered with a purple accent and PUP badge) sidestep the conflict and make the automation visible while it runs.

Can I use Playwright?

Yes, over CDP. Playwright's CDP transport (chromium.connectOverCDP) connects to the Puppeteer Driver the same way puppeteer.connect does: point it at http://127.0.0.1:9222. Playwright's higher-level Selenium-style API is not on the roadmap.

Ready to try it

Install LumaBrowser, enable either driver (or both), and point your existing client at the local port. The natural-language resolver is waiting.