Puppeteer technical reference

Self-Healing Puppeteer via CDP with LLM Selector Fallback

LumaBrowser is a drop-in Chrome DevTools Protocol target for Puppeteer. Its CDP server runs on port 9222 from inside the browser process, so puppeteer.connect({ browserURL }) attaches unchanged and every native Page, Runtime, DOM, Input, Network, and Fetch command forwards through to Chromium. The Lumabyte.* domain adds LLM-powered selector fallback, so brittle CSS self-heals when the DOM shifts underneath your scripts.

Looking for the marketing intro? Start at Self-Healing Selectors — the hub page covers Selenium, Playwright, and Puppeteer side-by-side. This page is the Puppeteer-specific deep dive.

LumaBrowser vs Puppeteer at a glance

Puppeteer is a Node.js automation library that drives a bundled Chromium via CDP. LumaBrowser is the browser itself, exposing the same CDP surface on port 9222 plus an additional Lumabyte.* domain for LLM-powered selector fallback. Your existing puppeteer-core script keeps working; what changes is the launch mechanism and the new capabilities available to the session.

Capability | Puppeteer | LumaBrowser
Transport | Chrome DevTools Protocol | Chrome DevTools Protocol on port 9222
Launch model | puppeteer.launch() spawns a Chromium child process | puppeteer.connect({ browserURL }) attaches to a running browser
Chromium binary | ~170 MB bundled per platform | None; LumaBrowser is the browser
Native CDP (Page, Runtime, DOM, Network, Fetch) | Full support | Full support, forwarded through webContents.debugger
LLM selector fallback | Not available | Lumabyte.find, Lumabyte.click, description-first resolver
DOM + AX tree + screenshot | Three separate CDP calls | One call via Lumabyte.domSnapshot
Network request interception | Fetch.* / Network.* | Same, plus REST-level Network Watcher with webhook forwarding
MCP server for AI agents | Not provided | Built-in local MCP server
Dedicated automation tabs | Process-isolated profile | Purple-badged CDP-kind tabs live alongside the user's session
Also works with Selenium / WebdriverIO | No | Yes, via a separate W3C WebDriver server on port 9515
Setup cost | npm i puppeteer (downloads Chromium) | npx lumabrowser start

The drop-in promise

LumaBrowser’s cdp-driver extension is a standards-compliant CDP WebSocket server on port 9222 — the same port and discovery surface Chromium exposes when launched with --remote-debugging-port=9222. The server implements Browser.*, Target.*, and the custom Lumabyte.* domain itself; every other command (Page, Runtime, DOM, Input, Network, Fetch, Emulation, Log, ~290 in total) forwards straight through Electron’s webContents.debugger to Chromium. What you don’t have to change:

  • No new library. Keep your existing puppeteer, puppeteer-core, Playwright, chrome-remote-interface, pyppeteer, or chromedp client.
  • The HTTP bootstrap surface matches Chromium. GET /json/version returns a webSocketDebuggerUrl. /json and /json/list enumerate targets. PUT /json/new?<url> opens a target. puppeteer.connect({ browserURL }) uses exactly these to discover the WebSocket endpoint.
  • Flat-mode sessions, modern default. Target.setAutoAttach with flatten: true is supported, matching modern Puppeteer’s wire format.
  • Browser contexts isolate state. Target.createBrowserContext returns a context ID that partitions cookies and storage (backed by a persistent persist:cdp-ctx-<id> partition).
  • Native CDP is untouched. Network.setUserAgentOverride, Fetch.enable, Emulation.setDeviceMetricsOverride, DOMSnapshot.captureSnapshot — all pass through to Chromium the same way they would against chrome --headless.
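The discovery step described in the second bullet can be reproduced by hand, which helps when debugging a failed attach. The following is a minimal sketch, assuming Node 18+ (for the global fetch) and the default address http://127.0.0.1:9222. resolveWsEndpoint is a hypothetical helper name, not part of Puppeteer or LumaBrowser.

```javascript
// Hypothetical helper: fetches /json/version and returns the WebSocket URL,
// the same discovery step puppeteer.connect({ browserURL }) relies on.
// fetchImpl is injectable so the function can be exercised without a running
// browser; by default it uses the global fetch available in Node 18+.
async function resolveWsEndpoint(browserURL = 'http://127.0.0.1:9222', fetchImpl = fetch) {
  const res = await fetchImpl(new URL('/json/version', browserURL));
  if (!res.ok) {
    throw new Error(`GET /json/version failed with HTTP ${res.status}`);
  }
  const info = await res.json();
  if (!info.webSocketDebuggerUrl) {
    throw new Error('Response has no webSocketDebuggerUrl field');
  }
  return info.webSocketDebuggerUrl;
}
```

The returned URL can be passed to puppeteer.connect({ browserWSEndpoint }) in place of browserURL if you prefer to pin the endpoint explicitly.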

Just change the launch:

Puppeteer (classic)
// Downloads and launches bundled Chromium.
// ES module syntax: top-level await needs
// an .mjs file or "type": "module".
import puppeteer from 'puppeteer';

const browser = await puppeteer.launch({
  headless: 'new',
});

const page = await browser.newPage();
await page.goto('https://example.com');
console.log(await page.title());

await browser.close();

LumaBrowser
// LumaBrowser is already running. Attach.
import puppeteer from 'puppeteer-core';

const browser = await puppeteer.connect({
  browserURL: 'http://127.0.0.1:9222',
});

const page = await browser.newPage();
await page.goto('https://example.com');
console.log(await page.title());

await browser.disconnect();

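That’s the entire migration: swap puppeteer for puppeteer-core (no bundled Chromium), change launch(...) to connect({ browserURL }) pointed at port 9222, and end with browser.disconnect() instead of browser.close(), so the script detaches without asking the running browser to exit. Everything below is what you gain by opting in.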

What’s actually different

Three concrete wins. Each one is opt-in: pay the cost only where you want the benefit.

Win 1: Resilient selectors that survive a redesign

Scrapers break when a marketing team renames a CSS class. LumaBrowser’s Lumabyte.* CDP domain retries the find via an LLM using a natural-language description of what the element is — not how it’s currently styled.

Two opt-in modes:

  1. Keep your CSS, add a description fallback. Lumabyte.find({ description, selector }) tries the selector against DOM.querySelector first and only invokes the LLM when it misses. Happy-path scrapes cost nothing extra.
  2. Skip CSS entirely. Lumabyte.find({ description }) or Lumabyte.click({ description }) lets the LLM own resolution end-to-end.

Puppeteer (classic)
// Breaks the moment the button's class
// name changes during a UI refresh.
import puppeteer from 'puppeteer';

const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('https://example.shop/cart');

await page.waitForSelector(
  'button.btn-primary.signup'
);
await page.click('button.btn-primary.signup');

LumaBrowser
// LLM fallback re-resolves when CSS misses.
import puppeteer from 'puppeteer-core';

const browser = await puppeteer.connect({
  browserURL: 'http://127.0.0.1:9222',
});
const page = await browser.newPage();
await page.goto('https://example.shop/cart');

const cdp = await page.target().createCDPSession();
await cdp.send('Lumabyte.configureFallback', {
  enabled: true,
});

// Option A: keep CSS, add a description fallback.
await cdp.send('Lumabyte.click', {
  description: 'the sign-up button',
  selector:    'button.btn-primary.signup',
});

// Option B: skip CSS entirely.
await cdp.send('Lumabyte.click', {
  description: 'the sign-up button',
});
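
If the same script also has to run against a stock Chrome, where the Lumabyte.* domain does not exist, one option is to wrap the call and fall back to a plain selector click. This is a sketch under that assumption. resilientClick is a hypothetical helper, and treating any rejection of Lumabyte.click as a reason to fall back is a simplification you may want to narrow.

```javascript
// Hypothetical helper, not part of Puppeteer or LumaBrowser.
// page: a Puppeteer Page; cdp: a CDPSession created from that page.
// Resolves to 'lumabyte' or 'selector' depending on which path performed the click.
async function resilientClick(page, cdp, { description, selector }) {
  const params = selector ? { description, selector } : { description };
  try {
    await cdp.send('Lumabyte.click', params);
    return 'lumabyte';
  } catch (err) {
    // Without a selector there is nothing to fall back to.
    if (!selector) throw err;
    await page.click(selector);
    return 'selector';
  }
}
```

Returning which path ran makes it easy to log how often the plain-selector fallback is used.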

Win 2: One round-trip to read the whole page

Classic Puppeteer agent loops make three separate CDP round-trips to capture page state: DOMSnapshot.captureSnapshot, then Accessibility.getFullAXTree, then Page.captureScreenshot. LumaBrowser’s Lumabyte.domSnapshot returns all three in a single call, ordered and correlated server-side.

Puppeteer (classic)
// Three CDP round-trips per agent step.
const cdp = await page.target().createCDPSession();
await cdp.send('DOM.enable');
await cdp.send('Accessibility.enable');

const snapshot = await cdp.send(
  'DOMSnapshot.captureSnapshot',
  { computedStyles: [] }
);
const axTree = await cdp.send(
  'Accessibility.getFullAXTree'
);
const { data: screenshot } = await cdp.send(
  'Page.captureScreenshot',
  { format: 'png' }
);
// 3 requests, 3 responses, 3 places to retry.

LumaBrowser
// One call, server-correlated payload.
const cdp = await page.target().createCDPSession();

const { snapshot, axTree, screenshot } =
  await cdp.send('Lumabyte.domSnapshot', {
    includeAxTree:     true,
    includeScreenshot: true,
  });

// snapshot   — DOMSnapshot.captureSnapshot
// axTree     — Accessibility.getFullAXTree
// screenshot — base64 PNG of the viewport
// 1 request. Parse locally.
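
The screenshot field arrives as base64 text, so it needs decoding before it can be written to disk or handed to an image diff. Here is a minimal sketch, assuming the payload is a PNG as described above. decodePngBase64 is a hypothetical helper name.

```javascript
// The eight-byte signature that every PNG file starts with.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Hypothetical helper: decode a base64 string and confirm it looks like a PNG.
function decodePngBase64(base64) {
  const bytes = Buffer.from(base64, 'base64');
  const head = bytes.subarray(0, PNG_SIGNATURE.length);
  if (bytes.length < PNG_SIGNATURE.length || !head.equals(PNG_SIGNATURE)) {
    throw new Error('Decoded data does not start with the PNG signature');
  }
  return bytes;
}
```

The returned Buffer can be passed directly to fs.promises.writeFile('page.png', bytes). The signature check is a cheap guard against saving an error payload as an image.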

Win 3: No bundled Chromium, dedicated automation tabs

Classic puppeteer ships a ~170 MB Chromium binary per platform and launches it as a child process for every script run. That binary drifts behind Chrome stable, your CI downloads it on every cold cache, and the --user-data-dir left behind needs cleanup. LumaBrowser’s CDP server runs inside the browser process itself — there is no binary to download, no launcher lifecycle, no temp profile to reap.

And unlike attaching to a user’s regular Chrome (where your automation would fight for the same tabs the user is browsing in), LumaBrowser gives the CDP client dedicated automation tabs: only tabs created with kind: 'cdp' surface as CDP targets, rendered in the tab strip with a purple accent and a CDP badge so you can watch the session drive them live.

Puppeteer (classic)
// puppeteer ships a bundled Chromium and
// spawns it as a child process on every run.
import puppeteer from 'puppeteer';

const browser = await puppeteer.launch({
  headless: 'new',
  args: ['--no-sandbox'],
  userDataDir: '/tmp/pup-profile',
});

// Plus: Chromium version drift, ~170MB
// per-platform binary, child-process cleanup,
// stale --user-data-dir trees, and the
// occasional "Failed to launch the browser
// process!" at 3am.

LumaBrowser
// LumaBrowser is already running. Attach.
import puppeteer from 'puppeteer-core';

const browser = await puppeteer.connect({
  browserURL: 'http://127.0.0.1:9222',
});

// Dedicated "CDP" tabs, visible in the strip.
// No binary download.
// No child process.
// No --user-data-dir to clean up.
// The user's regular tabs stay invisible to
// the CDP client, so Network Interceptor and
// AI Chat don't fight for the debugger lock.

Full example: a semantic end-to-end scrape

Here is what a realistic flow looks like when you mix native CDP with Lumabyte.*. Notice how close the code reads to the task description.

// ES module syntax: save as .mjs (or set "type": "module") so top-level await works.
import puppeteer from 'puppeteer-core';

// 1. Attach to the already-running browser.
const browser = await puppeteer.connect({
  browserURL: 'http://127.0.0.1:9222',
});
const page = await browser.newPage();
await page.goto('https://example.shop/products/coffee-grinder');

const cdp = await page.target().createCDPSession();

// 2. Opt in to LLM fallback for this session.
await cdp.send('Lumabyte.configureFallback', {
  enabled:            true,
  onFindFail:         true,   // retry failed finds via LLM
  onClickIntercepted: true,   // retry intercepted clicks via LLM
});

// 3. Describe elements, don't select them.
await cdp.send('Lumabyte.click', { description: 'the add-to-cart button' });
await cdp.send('Lumabyte.click', { description: 'the cart icon in the header' });

// 4. Pull DOM + AX tree + screenshot in one round-trip.
const { snapshot, axTree, screenshot } = await cdp.send('Lumabyte.domSnapshot', {
  includeAxTree:     true,
  includeScreenshot: true,
});

// 5. Native CDP still works — verify via Runtime.evaluate.
const { result } = await cdp.send('Runtime.evaluate', {
  expression:    'document.title',
  returnByValue: true,
});
console.log(result.value);

await browser.disconnect();

Twenty-odd lines, zero brittle CSS, one round-trip to capture the page state for your agent loop or visual diff. Try writing the same flow in classic Puppeteer and count the selectors you’d need to maintain next quarter.

Compatibility matrix

The full row-by-row CDP matrix — Browser, Target, Page, Runtime, DOM, Input, Network, Fetch, Emulation, DOMSnapshot, browser contexts, plus the additive Lumabyte.* methods (find, click, domSnapshot, configureFallback, getInfo) — lives in the API reference: /apis → CDP Driver section. Every native CDP domain forwards straight through webContents.debugger to Chromium; the Lumabyte.* domain is purely additive, so existing Puppeteer code ignores it and keeps running.

How to enable it

There are two ways to enable it, and either takes under a minute:

From the UI:

  1. Open Settings → CDP Driver.
  2. (Optional) Check “Start automatically when LumaBrowser launches.”
  3. Click Save & Start.
  4. Point your Puppeteer client at http://127.0.0.1:9222.

Or start it programmatically: POST /api/cdp/start. Full CDP domain reference, Lumabyte.* method signatures, host/port settings, and MCP tool surface are in the CDP Driver section of the API docs.
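
When the server is started from a script or a CI job, the client may try to attach before port 9222 is listening. The following is a minimal sketch of a readiness check, assuming Node 18+ and the default address. waitForCdp and its options are hypothetical names, and it only polls the GET /json/version endpoint described earlier.

```javascript
// Hypothetical helper: poll GET /json/version until the CDP server answers.
// fetchImpl and sleep are injectable for testing; the defaults suit Node 18+.
async function waitForCdp(browserURL = 'http://127.0.0.1:9222', {
  timeoutMs = 10000,
  intervalMs = 250,
  fetchImpl = fetch,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
} = {}) {
  const deadline = Date.now() + timeoutMs;
  let lastError;
  do {
    try {
      const res = await fetchImpl(new URL('/json/version', browserURL));
      if (res.ok) return await res.json();
      lastError = new Error(`HTTP ${res.status}`);
    } catch (err) {
      lastError = err; // e.g. connection refused while the server is starting
    }
    await sleep(intervalMs);
  } while (Date.now() < deadline);
  throw new Error(`CDP server not ready after ${timeoutMs} ms: ${lastError.message}`);
}
```

Call it once before puppeteer.connect({ browserURL }); it resolves with the parsed /json/version payload, or rejects with the last error seen once the timeout passes.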

FAQ

Does my existing Puppeteer script just work?

Yes. LumaBrowser implements the full CDP bootstrap surface (/json/version, /json/list, /json/new, /devtools/browser/{uuid}) and forwards every non-Lumabyte command straight through Electron’s webContents.debugger to Chromium. puppeteer.connect({ browserURL: 'http://127.0.0.1:9222' }) attaches the same way it does to any Chrome instance launched with --remote-debugging-port. Swap puppeteer for puppeteer-core (no bundled Chromium needed) and change launch() to connect({ browserURL }).

Can I use Playwright instead?

Yes. Playwright’s connectOverCDP transport (connect_over_cdp in the Python bindings) speaks the same protocol: point it at http://127.0.0.1:9222 and it attaches exactly like puppeteer.connect. Any other CDP client works too: chrome-remote-interface in Node, pyppeteer in Python, chromedp in Go, or anything else that speaks CDP over WebSocket. The Lumabyte.* domain is available from all of them via the client’s generic send(method, params) entry point.

Which LLM runs the Lumabyte.find fallback?

The fallback routes through whichever model you configure in the shared core-scope selector-resolver slot. Configure it once and every description-based call from either the Puppeteer or the Selenium driver benefits — same orchestrator, same pipeline. If the slot is unconfigured, the resolver falls back to your global active LLM. You can point it at Anthropic, any OpenAI-compatible endpoint, or a local model via LM Studio or Ollama.

What about Selenium or Playwright?

LumaBrowser ships a separate WebDriver server on port 9515 alongside the CDP server, and Playwright’s chromium.connectOverCDP('http://127.0.0.1:9222') attaches to the same CDP endpoint as Puppeteer. See Selenium, Playwright, or the cross-driver hub for when to pick each.
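
For the Playwright route mentioned above, reaching the Lumabyte.* domain takes one extra step: opening a raw CDP session for the page. This is a minimal sketch, assuming the chromium object exported by the playwright package is passed in. attachWithPlaywright is a hypothetical helper name, and which Playwright features work over this endpoint is not covered here.

```javascript
// Hypothetical helper. `chromium` is the chromium export of the playwright package.
async function attachWithPlaywright(chromium, endpoint = 'http://127.0.0.1:9222') {
  const browser = await chromium.connectOverCDP(endpoint);
  // Reuse the default context if the browser exposes one; otherwise create one.
  const context = browser.contexts()[0] ?? (await browser.newContext());
  const page = await context.newPage();
  // Raw CDP session for the page, used to send Lumabyte.* commands.
  const cdp = await context.newCDPSession(page);
  return { browser, context, page, cdp };
}
```

With the returned session, cdp.send('Lumabyte.click', { description: 'the sign-up button' }) takes the same parameters as in the Puppeteer examples.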

Ready to try it

Install LumaBrowser, enable the cdp-driver extension, and swap puppeteer.launch() for puppeteer.connect({ browserURL: 'http://127.0.0.1:9222' }). Every native CDP command keeps working. The Lumabyte.* domain is there when you want it.