██████╗░░█████╗░███╗░░░███╗░░██████╗░░█████╗░██████╗░██████╗░██╗░░░██╗
██╔══██╗██╔══██╗████╗░████║░░██╔══██╗██╔══██╗██╔══██╗██╔══██╗╚██╗░██╔╝
██║░░██║██║░░██║██╔████╔██║░░██║░░██║███████║██║░░██║██║░░██║░╚████╔╝░
██║░░██║██║░░██║██║╚██╔╝██║░░██║░░██║██╔══██║██║░░██║██║░░██║░░╚██╔╝░░
██████╔╝╚█████╔╝██║░╚═╝░██║░░██████╔╝██║░░██║██████╔╝██████╔╝░░░██║░░░
╚═════╝░░╚════╝░╚═╝░░░░░╚═╝░░╚═════╝░╚═╝░░╚═╝╚═════╝░╚═════╝░░░░╚═╝░░░
E X T R A C T A N Y T H I N G
by Trollefsen Labs
A Manifest V3 Chrome extension that extracts structured data from sites that fight scraping. Currently:
- ChatGPT, Claude, Gemini, AI Studio, Perplexity chats → Markdown / Text / JSON / CSV
- LinkedIn experience pages (
/in/{you}/details/experience/) → Markdown / JSON / CSV (one row per role)
Pure client-side. No server, no build step, no analytics, no dependencies.
- Open
chrome://extensions. - Toggle Developer mode on (top-right).
- Click Load unpacked and select the
extension/directory in this repo. - Pin the extension to the toolbar.
- Open a supported page, click the icon, choose a format.
For LinkedIn specifically: open your (or another user's) profile, click Show all experience, and run DOM Daddy on the resulting …/details/experience/ page. The popup will offer a "Take me there" shortcut if you click it on the wrong sub-page.
Works on any Chromium browser (Chrome, Edge, Brave, Arc, etc.).
extension/
manifest.json
src/
background/ Thin service worker (lifecycle hooks only)
content/ One extractor per supported host
chatgpt.js claude.js gemini.js aistudio.js perplexity.js
linkedin.js /in/{slug}/details/experience/
lib/
schema.js Conversation + Profile types (kind discriminator)
markdown.js HTML -> Markdown converter (no deps)
exporters/
exporters.js export{Markdown,Text,JSON,CSV} for conversations
export{ProfileMarkdown,ProfileJSON,ProfileCSV} for profiles
popup/
popup.html / .css / .js User-facing UI; branches on result kind
icons/
Popup opened
-> popup.js sends { type: 'EXTRACT' } to the active tab
-> matching content script returns { kind, ...data }
-> popup branches on kind, runs the right exporter
-> chrome.downloads delivers the file
The shared schema (extension/src/lib/schema.js) is the contract between extractors and exporters. Two shapes today: Conversation (kind: 'conversation') and Profile (kind: 'profile').
MV3 doesn't allow content scripts to be declared as ES modules. To still share markdown.js and schema.js across extractors, each content script does await import(chrome.runtime.getURL('src/lib/...js')). The shared files are listed under web_accessible_resources so they're loadable from the page context.
- Selectors drift. When a site reorganizes, only that site's content script needs to change. Stable anchors:
- ChatGPT:
[data-message-author-role],[data-message-id] - Claude:
[data-testid="user-message"],.font-claude-message - Gemini:
user-query,model-response(Angular component tags) - AI Studio:
ms-chat-turn(role fromdata-turn-role); content inms-cmark-node - Perplexity:
[class~="group/query"]for queries,.prosefor answers - LinkedIn:
[componentkey^="entity-collection-item-"]per company entry; we parseinnerTextline-by-line and ignore hashed CSS classes entirely.
- ChatGPT:
- Virtualized / collapsed UIs lose data. Chat sites unmount off-screen messages; "Show thinking" details and LinkedIn's
…see moremay collapse content. Scroll/expand before extracting. - LinkedIn
+N skills. The "+N skills" overflow on roles can't be pulled without clicking the chip — we capture the visible skills and store the hidden count ashiddenSkillCount. - Canvas / Artifacts (ChatGPT side panel, Claude artifacts) aren't currently captured.
- No real chat timestamps. None of the chat hosts expose creation date or per-message timestamps in the DOM, so the date in the filename is the export date.
Conversations: {source}-YYYYMMDD-{sessionId}.{ext}. Profiles: {source}-{slug}-YYYYMMDD.{ext}.
If the Save As dialog shows a different filename than what we suggested, another installed extension is hooking chrome.downloads.onDeterminingFilename and overwriting our suggestion (Chrome only honors the most recently installed listener — there's no override). The popup's green "Saved …" status surfaces the real on-disk filename, so you can compare.
Known offender: Suno Tracks Exporter. Disable it (or any other download-manager extension) while exporting if you need the suggested filename to land.
- Add the host to
manifest.json(host_permissions,content_scripts,web_accessible_resources). - Add a
SITESentry inextension/src/popup/popup.jswithkind: 'conversation'. - Copy an existing chat extractor and rewrite its selector block.
- Same manifest updates, plus a
pageReady/pageHintif the data lives on a sub-page. - New extractor returning
makeProfile(...)(or a new schema shape if the data isn't profile-like). - New exporters in
exporters.jsif a new kind needs different output formats.
Add an exportXxx(data, opts) function in exporters.js returning { filename, blob }, then wire a button in popup.html and a case in runExport() in popup.js.
Code is licensed under the Apache License 2.0 (see LICENSE) — you can use, modify, and redistribute it commercially, subject to the usual Apache obligations.
The name DOM Daddy, the mascot, and all files under extension/icons/ plus favicon.ico are © Trollefsen Labs and are not covered by the code license. Forks must rename and re-skin before publishing to the Chrome Web Store or any other distribution channel. Full carve-out in NOTICE.
A from-scratch alternative to closed-source export extensions, with the goal of (1) keeping all extracted data inside the browser and (2) being trivially auditable — no minified bundles, no server, no analytics.