Notebooks Web Clipper One-click capture of the page you're viewing (any browser) β†’ Notebooks app (HTML note + headless-rendered PDF), named by a local model

A local, one-click web clipper that captures the page you’re viewing in any browser (Safari, Chrome, Firefox, Brave, Arc…) and files it into the Notebooks app (Alfons Schmid) as an editable HTML note plus an attached PDF (the visual fallback for images the HTML doesn’t carry). Built 2026-06-06; rewritten 2026-06-07 (v2) to remove the dependency on a specially-launched Brave with a remote-debugging port β€” clipping is now browser-agnostic.

How it works (v2)

  1. Trigger the clip β€” click the extension’s toolbar button (recommended) or the bookmarklet (fallback) in any browser. Either runs in the page’s own context, so it sees the fully-rendered, logged-in DOM. It clones the document, strips scripts / iframes / video / fixed-sticky overlays / cookie-ad-modal junk, and POSTs the cleaned HTML to the local service (POST localhost:8787/clip?u=<url>&t=<title>, body = HTML).
  2. The clip service receives the HTML β€” no browser automation, no debug port, no specific browser required.
  3. It renders a faithful PDF by shelling out to a headless Chromium binary (Google Chrome preferred; Chromium/Edge/Brave fallback) purely as an offline --print-to-pdf renderer with a throwaway profile. This is independent of whatever browser you clipped from and runs fine alongside an already-open browser.
  4. It asks your local LM Studio model for a concise, descriptive filename-title (falls back to the cleaned page title).
  5. It writes the HTML note (rendered page + injected <base href> + a top banner linking the PDF and source) into the Notebooks Clipper book, saves the PDF to a stable Syncthing folder, updates Clippings Index.html, and pops a macOS banner "πŸ“Ž Clipped β†’ Notebooks".

Headless render quirk (handled): on macOS, headless Chromium writes the PDF in ~1–3 s but frequently will not exit. The service therefore launches it in its own process group, polls for a stable output PDF, then hard-kills the group β€” it never blocks on the process exiting, and leaves no orphan helper processes. Brave’s headless mode does not reliably emit --print-to-pdf on macOS (Shields/rewards services interfere), so Chrome is preferred as the renderer.

Why PDFs live outside the Notebooks library: the iCloud-managed Notebooks container silently deletes loose PDFs (only the newest survived in testing β€” data loss). PDFs are therefore stored in ~/Sync/ED/notebooks-pdfs/ (Syncthing, full local copies, never evicted) and linked from each note’s top banner via a localhost:8787/open link. They don’t appear as documents inside Notebooks, but they persist and the link opens them in Preview.

Components

Piece Path / detail
Clip service /Users/bee/Sync/ED/scripts/clipper/clip_service.py (Python, v2 β€” POST-based, no CDP)
Service venv /Users/bee/venvs/clipper
Service agent ~/Library/LaunchAgents/me.edmd.clipper.plist β€” KeepAlive, listens 127.0.0.1:8787
PDF renderer headless Google Chrome (auto-detected; Chromium/Edge/Brave fallback) β€” offline --print-to-pdf, throwaway profile
Bookmarklet source ~/Sync/ED/scripts/clipper/bookmarklet.js (paste as a bookmark URL)
Naming model LM Studio at localhost:1234; picks first non-embedding chat model (prefers an a3b for speed)
Notes + index ~/Library/Mobile Documents/iCloud~com~aschmid~notebooks/Documents/Clipper/ (HTML notes + Clippings Index.html)
PDFs ~/Sync/ED/notebooks-pdfs/ (Syncthing β€” not the iCloud library, which deletes them)
PDF reaper ~/scripts/clip-reap.py + me.edmd.clip-reap (every 6 h) β€” deletes orphan PDFs
Logs ~/Sync/ED/scripts/clipper/clip.log (clips), service.out.log / service.err.log (process)

Retired in v2: me.edmd.brave-debug.plist (login agent that force-launched Brave with --remote-debugging-port=9222) β€” booted out and archived as me.edmd.brave-debug.plist.retired-20260607. The Launch Brave (clipper).command is likewise obsolete (Brave no longer needs the debug port). The previous (CDP) service is backed up as clip_service.py.bak.v1-cdp-*.

~/Sync/ED/scripts/clipper/extension/ is a minimal MV3 extension: a toolbar button (honey paperclip icon) that captures the current page and POSTs it to the service. Preferred over the bookmarklet β€” real custom icon, and the service-worker fetch to localhost avoids the bookmarklet’s http-from-https quirks.

  • Load (Brave/Chrome): brave://extensions (or chrome://extensions) β†’ enable Developer mode β†’ Load unpacked β†’ select ~/Sync/ED/scripts/clipper/extension/. Pin it to the toolbar.
  • Use: click the button on any page. A green βœ“ badge = clipped; βœ— = capture/HTTP problem; βœ— on chrome:///store pages (can’t script those); ! = service not running on :8787.
  • Hotkey: βŒ˜β‡§Y by default (_execute_action command). Change/assign it at brave://extensions/shortcuts if it conflicts or doesn’t take.
  • Permissions: activeTab + scripting (only touches the tab when you click) and host_permissions: http://localhost:8787/* (to reach the local service).
  • Icon source: extension/icon.svg β†’ rendered to extension/icons/icon{16,32,48,128}.png via rsvg-convert.

Bookmarklet (v2 β€” fallback)

Alternatively, paste the contents of ~/Sync/ED/scripts/clipper/bookmarklet.js as the URL of a new bookmark (name it πŸ“Ž). Works in any browser; clipping the page you’re on POSTs the live HTML to the service. (Replaces the old no-cors GET bookmarklet β€” reinstall it; the old one hits /clip with no body and the service will pop a “reinstall the bookmarklet” notice.)

Operations

  • Restart the service: launchctl kickstart -k gui/$(id -u)/me.edmd.clipper
  • Is it listening? lsof -nP -iTCP:8787 -sTCP:LISTEN
  • Watch clips: tail -f ~/Sync/ED/scripts/clipper/clip.log
  • Stop / start agent: launchctl bootout|bootstrap gui/$(id -u) ~/Library/LaunchAgents/me.edmd.clipper.plist
  • Manual test clip: curl -X POST 'http://localhost:8787/clip?u=https://example.com&t=Test' -H 'Content-Type: text/plain' --data-binary @somepage.html

Troubleshooting

  • pdf=NO in clip.log / PDF-FAIL β†’ the headless renderer didn’t emit a PDF. Confirm a Chromium binary exists (Chrome preferred); Brave alone won’t render reliably. The note is still saved without the PDF.
  • Generic / un-smart names β†’ LM Studio server not running or no model loaded (curl localhost:1234/v1/models). The clip still succeeds; it falls back to the cleaned page title. Logged as LLM-NAME-FAIL.
  • Nothing happens on click β†’ service down (launchctl list | grep clipper), or the browser blocked the http://localhost fetch from an https page. Chromium-family and Firefox allow loopback; if Safari blocks it the bookmarklet pops a “Clip failed to reach service” alert.
  • “Clipper updated β€” reinstall the bookmarklet” banner β†’ you’re still using the old GET bookmarklet; replace it with the v2 one above.

Security note

The service binds to 127.0.0.1:8787 only and accepts a page’s HTML over POST. No browser remote-debugging port is exposed anymore (that was the v1 design and has been retired). The headless renderer runs with a throwaway --user-data-dir, so it never touches your real browser profile.

  • Dictation pipeline (process-dictation) is the audio analogue; this is the web analogue.