im_wower  ·  2026-03-22

PLAN.md

# BAA Firefox MVP Plan

## Goal

Build the smallest Firefox extension MVP that proves this model works:

- one always-open controller page inside the extension
- one real top-level AI tab
- manual login only
- automatic discovery of API endpoints, auth headers, and streaming events
- a long-lived WebSocket from the controller page to `baa-server`
- no DOM parsing as the primary path

The first platform should be `claude.ai` only. After it works, generalize.

## Non-Goals

Do not solve these in MVP:

- multi-platform support
- multiple tabs per platform
- multiple accounts
- multi-window coordination
- iframe embedding
- full request execution from server into the AI tab
- automatic login
- deep DOM automation
- generic worker/websocket interception
- polished UI

## Core Decision

Do not keep the main WebSocket in `background.js`.

Instead:

- `controller.html` stays open in a normal browser tab
- `controller.js` owns the WebSocket and the runtime state
- `background.js` becomes thin bootstrap and helper logic only

Reason:

- Firefox MV3 background scripts are non-persistent and can be suspended when idle, so background lifetime is the main risk
- a visible extension page is simpler than trying to keep a background context alive
- the controller page can directly use the WebExtension APIs needed for MVP

## MVP Architecture

### 1. Controller Page

One extension page acts as the control tower.

Responsibilities:

- open and keep track of the single AI tab
- connect to `baa-server` over WS
- receive browser-side events from content scripts
- persist minimal state to `storage.local`
- surface simple status: connected, tab exists, credentials seen, endpoints count

The controller page should be opened manually at first.

Later, background can auto-open it on startup if needed.
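
The controller-owned WebSocket could be sketched roughly as follows. This is only an illustration under stated assumptions: the plan does not specify reconnect behavior, so the exponential backoff, the helper names `backoffDelay` and `connectWithRetry`, and the shape of the `handlers` object are all hypothetical.

```javascript
// Hypothetical helpers; the plan does not define a reconnect policy.

// Exponential backoff: 1s, 2s, 4s, ... capped at maxMs.
function backoffDelay(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Opens a WebSocket and reopens it after a delay whenever it closes.
// `WebSocketImpl` and `schedule` are injectable so the logic can run
// outside a browser; in controller.js they default to the globals.
function connectWithRetry(url, handlers, WebSocketImpl = WebSocket, schedule = setTimeout) {
  let attempt = 0;
  function open() {
    const ws = new WebSocketImpl(url);
    ws.onopen = () => {
      attempt = 0; // reset backoff after a successful connection
      handlers.onStatus(true);
      ws.send(JSON.stringify(handlers.hello()));
    };
    ws.onmessage = (event) => handlers.onMessage(event.data);
    ws.onclose = () => {
      handlers.onStatus(false);
      schedule(open, backoffDelay(attempt++));
    };
  }
  open();
}
```

In `controller.js` this might be called once on page load, with `onStatus` updating the "WS connected / disconnected" indicator and `hello` returning the `hello` message.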

### 2. Real AI Tab

Use one real top-level tab:

- `https://claude.ai`

Requirements:

- user logs in manually
- no specific chat URL required
- the root app page is enough

The tab is the real browser surface that carries:

- cookies
- session
- same-origin rules
- anti-bot browser context

### 3. Content Script Bridge

Inject a lightweight content script into the AI tab.

Responsibilities:

- receive `CustomEvent`s from the page context
- forward them to the controller page or background

This is the bridge between:

- privileged extension code
- page MAIN world instrumentation
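
A minimal sketch of the bridge, assuming the page script dispatches events named `baa:page-event` whose `detail` is a JSON string. Both the event name and the string encoding are assumptions made for illustration; passing a string rather than an object is one way to avoid cross-context wrapper surprises.

```javascript
// Event name is a hypothetical choice; the plan only says "CustomEvent".
const BRIDGE_EVENT = "baa:page-event";

// Listens on `target` (the page `window` in a real content script) and
// forwards each event's parsed `detail` payload via `sendMessage`.
function installBridge(target, sendMessage) {
  target.addEventListener(BRIDGE_EVENT, (event) => {
    let payload;
    try {
      payload = JSON.parse(event.detail);
    } catch (err) {
      return; // ignore events that are not valid JSON strings
    }
    sendMessage({ source: "page-interceptor", payload });
  });
}

// Inside the extension, wire the bridge to the real globals.
if (typeof browser !== "undefined" && typeof window !== "undefined") {
  installBridge(window, (msg) =>
    // sendMessage rejects when no listener is open; swallow that case.
    browser.runtime.sendMessage(msg).catch(() => {})
  );
}
```

Keeping the listener logic in a function that takes `target` and `sendMessage` as parameters leaves the content script small and lets the forwarding be exercised without a browser.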

### 4. MAIN World Interceptor

Inject a MAIN world script into the AI tab.

Responsibilities:

- patch `window.fetch`
- observe request URL, method, headers, body
- observe response status, headers, body
- observe SSE chunks if present
- emit normalized events through `CustomEvent`

This is the primary discovery layer for:

- API endpoints
- request/response shapes
- streaming behavior
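
The `fetch` patch could look roughly like this. It is a reduced sketch that reports only URL, method, and status; header, body, and SSE capture are left out. The event name `baa:page-event`, the JSON-string `detail`, and the event field names are assumptions, not part of the plan.

```javascript
// Returns a wrapped fetch that reports each request and response through
// `emit`, then hands the untouched response back to the caller.
function wrapFetch(originalFetch, emit) {
  return async function observedFetch(input, init = {}) {
    // `input` may be a string, a URL object, or a Request.
    const url = typeof input === "string" ? input : input.url || String(input);
    const method = (
      init.method || (typeof input === "object" && input.method) || "GET"
    ).toUpperCase();
    emit({ kind: "request", url, method, ts: Date.now() });
    const response = await originalFetch(input, init);
    emit({ kind: "response", url, method, status: response.status, ts: Date.now() });
    return response;
  };
}

// In the page (MAIN world), install the wrapper over window.fetch.
if (typeof window !== "undefined" && typeof window.fetch === "function") {
  window.fetch = wrapFetch(window.fetch.bind(window), (detail) =>
    window.dispatchEvent(
      new CustomEvent("baa:page-event", { detail: JSON.stringify(detail) })
    )
  );
}
```

The wrapper never reads the response body, so it cannot interfere with the page's own consumption of the stream; body and SSE observation would need additional care.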

### 5. Browser-Level Request Observer

Use `webRequest` in parallel.

Responsibilities:

- capture outgoing request headers at browser level
- recover auth-related headers not visible from page JS
- build a credential snapshot
- provide a second signal for endpoint discovery

This is the primary source for:

- cookies
- CSRF-style headers
- browser-visible request metadata
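
A sketch of the header observer, assuming a fixed allow-list of header names. `cookie`, `x-csrf-token`, and `anthropic-client-sha` are taken from the example snapshot later in this plan; `authorization` and the helper name `pickAuthHeaders` are added assumptions.

```javascript
// Header names treated as auth material (lowercase).
// "authorization" is an assumption; the others appear in this plan.
const AUTH_HEADER_NAMES = new Set([
  "cookie",
  "authorization",
  "x-csrf-token",
  "anthropic-client-sha",
]);

// `requestHeaders` is the array of { name, value } objects that
// webRequest passes to listeners registered with "requestHeaders".
function pickAuthHeaders(requestHeaders = []) {
  const picked = {};
  for (const { name, value } of requestHeaders) {
    const key = name.toLowerCase();
    if (AUTH_HEADER_NAMES.has(key) && value !== undefined) picked[key] = value;
  }
  return picked;
}

// Inside the extension, observe outgoing requests to claude.ai.
if (typeof browser !== "undefined" && browser.webRequest) {
  browser.webRequest.onBeforeSendHeaders.addListener(
    (details) => {
      const headers = pickAuthHeaders(details.requestHeaders);
      if (Object.keys(headers).length > 0) {
        // Hypothetical hand-off: controller.js would store and forward these.
        console.log("auth headers seen for", details.url, Object.keys(headers));
      }
    },
    { urls: ["https://claude.ai/*"] },
    ["requestHeaders"]
  );
}
```

The listener is registered without `"blocking"`, so it only observes requests and never modifies them.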

## Why This MVP Should Work

This design avoids the fragile path:

- no DOM scraping as the primary integration
- no button clicking or selector dependency
- no iframe

This design keeps the stable path:

- real browser tab
- real cookies
- real request flow
- extension-side network observation

The working proof for MVP is simple:

- the user logs into Claude manually
- the extension sees live endpoint traffic
- the extension captures auth material
- the extension sends these to `baa-server`

## Files To Create

The repo should start with these files:

- `manifest.json`
- `controller.html`
- `controller.js`
- `background.js`
- `content-script.js`
- `page-interceptor.js`
- `README.md`

Optional:

- `controller.css`

## File Responsibilities

### `manifest.json`

Firefox-targeted MV3 manifest.

Include only what MVP needs:

- `tabs`
- `storage`
- `webRequest`
- `cookies`
- `scripting`
- host permissions for `https://claude.ai/*`

Register:

- `background.js`
- `content-script.js`
- `page-interceptor.js`
- a toolbar button (`action` in MV3) or a direct tab page for `controller.html`

### `controller.html`

Minimal UI only.

Must show:

- WS connected / disconnected
- AI tab exists / missing
- credentials seen / missing
- number of discovered endpoints
- recent log lines

One button is enough:

- `Open Claude Tab`

### `controller.js`

Main runtime owner.

Responsibilities:

- connect WS to `baa-server`
- create or recover the Claude tab
- maintain `platformTabId`
- receive forwarded events
- normalize and send them to server
- persist last known state

State shape can be minimal:

```js
// Runtime state owned by controller.js (values are placeholders).
const state = {
  platform: "claude",
  tabId: 123,           // id of the single Claude tab
  wsConnected: true,
  lastCredentialAt: 0,  // timestamp of the last credential snapshot
  endpoints: {},
  lastHeaders: {},
  lastEvents: []
};
```

### `background.js`

Thin only.

Responsibilities:

- open/focus `controller.html` when extension icon is clicked
- optionally relay messages if direct page-to-tab wiring is awkward
- nothing long-lived beyond bootstrap

No main WS should live here in MVP.
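
A thin `background.js` along these lines might be enough. This is a sketch, assuming the manifest declares an `action` and the `tabs` permission; the function name `openOrFocusController` is hypothetical.

```javascript
// Opens controller.html, or focuses the existing tab if one is open.
// `api` is the WebExtension `browser` object, injected for testability.
async function openOrFocusController(api) {
  const url = api.runtime.getURL("controller.html");
  // Query all tabs and compare URLs; this relies on the "tabs" permission.
  const tabs = await api.tabs.query({});
  const existing = tabs.find((tab) => tab.url === url);
  if (existing) {
    await api.tabs.update(existing.id, { active: true });
    await api.windows.update(existing.windowId, { focused: true });
    return existing.id;
  }
  const created = await api.tabs.create({ url });
  return created.id;
}

// Inside the extension, run it whenever the toolbar icon is clicked.
if (typeof browser !== "undefined" && browser.action) {
  browser.action.onClicked.addListener(() => openOrFocusController(browser));
}
```

Nothing here holds state between clicks, which fits the goal of keeping the background context disposable.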

### `content-script.js`

Bridge only.

Responsibilities:

- listen for `CustomEvent`s from page context
- forward to extension runtime

Keep it small.

### `page-interceptor.js`

MAIN world network observer.

Responsibilities:

- wrap `fetch`
- capture requests
- capture error responses
- capture streaming events where possible

For MVP, patching `fetch` is enough.

If Claude moves important traffic elsewhere later, expand after proof.

## Message Flow

### Browser Startup

1. User opens `controller.html`
2. `controller.js` connects WS to `baa-server`
3. `controller.js` checks for existing Claude tab
4. If missing, it opens `https://claude.ai`
5. Content script and MAIN world interceptor are active on that tab
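
Steps 3 and 4 could be sketched as a single "find or create" helper. The function name `ensureClaudeTab` and the choice to store the id under the `platformTabId` key in `storage.local` are assumptions for illustration.

```javascript
const CLAUDE_URL = "https://claude.ai/";
const CLAUDE_PATTERN = "https://claude.ai/*";

// Returns the id of the single Claude tab, reusing an open one when
// possible, and records it in storage.local so it survives a reload.
// `api` is the WebExtension `browser` object, injected for testability.
async function ensureClaudeTab(api) {
  const tabs = await api.tabs.query({ url: CLAUDE_PATTERN });
  const tab =
    tabs.length > 0
      ? tabs[0] // MVP assumes one tab per platform; take the first match
      : await api.tabs.create({ url: CLAUDE_URL, active: true });
  await api.storage.local.set({ platformTabId: tab.id });
  return tab.id;
}
```

The same helper can back the `Open Claude Tab` button and the recovery path after the Claude tab is closed.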

### Manual Login

1. User logs into Claude in the real tab
2. Claude app bootstraps and starts making network requests
3. `webRequest` captures browser-visible auth headers and cookies
4. MAIN world interceptor captures request and response data
5. Controller receives both and sends normalized events to `baa-server`

### Endpoint Discovery

1. MAIN world interceptor sees a request
2. URL path is normalized
3. Method + normalized path are added to endpoint registry
4. Controller sends discovered endpoints to `baa-server`
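
The normalization and registry steps above can be sketched as follows. The rule used here, replacing UUID-shaped and purely numeric path segments with `{id}`, is an assumption about what "very basic ID replacement" means; the function names and the registry key format are hypothetical.

```javascript
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
const NUMERIC_RE = /^\d+$/;

// Drops the query string and replaces ID-like segments with "{id}".
function normalizePath(rawUrl) {
  // The base URL lets relative paths such as "/api/x" parse too.
  const { pathname } = new URL(rawUrl, "https://claude.ai");
  return pathname
    .split("/")
    .map((seg) => (UUID_RE.test(seg) || NUMERIC_RE.test(seg) ? "{id}" : seg))
    .join("/");
}

// Adds "METHOD path" to the registry; returns true if it was new,
// so the caller knows when to send `endpoint_discovered`.
function recordEndpoint(registry, method, rawUrl) {
  const verb = method.toUpperCase();
  const path = normalizePath(rawUrl);
  const key = `${verb} ${path}`;
  const isNew = !(key in registry);
  registry[key] = { method: verb, path, lastSeen: Date.now() };
  return isNew;
}
```

Returning a boolean from `recordEndpoint` keeps the "send only on first sighting" decision in the controller rather than in the registry.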

### Credential Snapshot

1. `webRequest` sees request headers
2. Controller extracts a reduced auth snapshot
3. Snapshot is timestamped and stored
4. Snapshot is sent to `baa-server`
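
Steps 2 to 4 might be sketched like this, given headers that have already been reduced to auth material. The change check that skips unchanged credentials is an added assumption (the plan does not say how often snapshots are sent), and both function names are hypothetical.

```javascript
// Builds a `credential_snapshot` message from already-filtered headers.
function buildCredentialSnapshot(platform, headers, now = Date.now()) {
  return { type: "credential_snapshot", platform, headers: { ...headers }, ts: now };
}

// Stable, order-independent serialization of a headers object.
function headerFingerprint(headers) {
  return JSON.stringify(Object.keys(headers).sort().map((k) => [k, headers[k]]));
}

// True when the header values differ from the previous snapshot, so the
// controller can avoid resending identical credentials on every request.
function snapshotChanged(previous, next) {
  if (!previous) return true;
  return headerFingerprint(previous.headers) !== headerFingerprint(next.headers);
}
```

A controller could keep the last snapshot in its state, call `snapshotChanged` on each `webRequest` hit, and only store and send when it returns `true`.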

## Data Sent To Server

MVP should send only these message types:

- `hello`
- `endpoint_discovered`
- `credential_snapshot`
- `network_event`
- `sse_event`
- `status`

Example normalized messages:

```json
{
  "type": "hello",
  "clientId": "ff-xxxxxx",
  "nodeType": "browser",
  "nodeCategory": "proxy",
  "nodePlatform": "firefox"
}
```

```json
{
  "type": "endpoint_discovered",
  "platform": "claude",
  "method": "POST",
  "path": "/api/organizations/{id}/messages"
}
```

```json
{
  "type": "credential_snapshot",
  "platform": "claude",
  "headers": {
    "cookie": "...",
    "x-csrf-token": "...",
    "anthropic-client-sha": "..."
  },
  "ts": 0
}
```
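
One way to enforce the "only these message types" rule is a small encoder that every outgoing message passes through. This is a sketch; the function name `encodeMessage` and the choice to throw on unknown types are assumptions.

```javascript
// The six message types listed in this plan.
const MESSAGE_TYPES = new Set([
  "hello",
  "endpoint_discovered",
  "credential_snapshot",
  "network_event",
  "sse_event",
  "status",
]);

// Serializes a message for the WebSocket, rejecting unknown types so
// that ad hoc messages do not leak into the server protocol.
function encodeMessage(message) {
  if (!message || !MESSAGE_TYPES.has(message.type)) {
    throw new Error(`unsupported message type: ${message && message.type}`);
  }
  return JSON.stringify(message);
}
```

For example, `ws.send(encodeMessage({ type: "status", wsConnected: true }))` would pass, while a message with `type: "debug"` would throw before reaching the socket. The `wsConnected` field in that example is illustrative only.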

## Minimal Acceptance Criteria

MVP is done when all of these are true:

- opening `controller.html` establishes a WS connection to `baa-server`
- the controller can open or recover exactly one Claude tab
- manual login in that tab leads to captured credentials
- at least one live Claude API endpoint is discovered automatically
- request/response metadata is visible in server logs
- SSE chunks are visible if Claude uses streaming on the observed path
- closing the Claude tab and reopening it still restores the flow

## Build Order

Implement in this order:

1. `manifest.json`
2. `controller.html` + `controller.js`
3. thin `background.js`
4. `content-script.js`
5. `page-interceptor.js`
6. `webRequest` credential capture
7. WS message normalization to server
8. minimal status UI

## Scope Cuts If Time Is Tight

If the first pass needs to be even smaller, cut these first:

- pretty UI
- SSE chunk parsing details
- background auto-open behavior
- storage persistence beyond `tabId` and last headers
- endpoint normalization beyond very basic ID replacement

Do not cut these:

- controller-owned WS
- real top-level Claude tab
- MAIN world fetch interception
- `webRequest` credential capture

## Deferred Work After MVP

Only after the MVP works:

- add active proxy execution from server into the live AI tab
- support ChatGPT and Gemini
- support multiple windows
- support multi-account contexts
- add stronger recovery logic
- add XHR and worker interception
- make the controller page optional by moving some logic back into a more robust runtime model

## TODO: SSE Capture

Observed in the current Firefox MVP:

- request / credential / endpoint capture works
- `POST /api/organizations/{id}/chat_conversations/{id}/completion` is discovered
- the current SSE bridge can still report `The operation was aborted` with `0` chunks
- this happened in the anonymous `https://claude.ai/new?incognito` flow after a real prompt/response round trip

Next work should focus on these items:

- reproduce the failure deterministically on both anonymous and logged-in organization chats
- verify whether `response.clone().body.getReader()` is racing with Claude's own stream consumption in Firefox
- test replacing the current clone-reader approach with a `ReadableStream.tee()` based interception path
- keep partial chunk state across boundaries so incomplete `data:` frames are not dropped
- include stable conversation identifiers in `sse_event` payloads when the completion URL contains them
- distinguish clean stream completion from abort / navigation / page refresh in emitted events
- add controller-side logs for first chunk, last chunk, abort reason, and total chunk count
- only mark SSE capture as done after a verified chunked response is observed in `baa-server` logs
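
The "keep partial chunk state across boundaries" item could be approached with a small incremental parser like the one below. It is a simplified sketch, not a full implementation of the SSE format: it handles `data:` and `event:` fields and `\n` or `\r\n` line endings only, and the name `createSseParser` is hypothetical.

```javascript
// Incremental SSE parser: feed() accepts arbitrary text chunks and
// returns only the events completed so far. An unfinished tail stays
// buffered until a later chunk supplies the blank-line terminator.
function createSseParser() {
  let buffer = "";
  return {
    feed(chunk) {
      // Normalize on the whole buffer so a "\r\n" split across two
      // chunks is still recognized.
      buffer = (buffer + chunk).replace(/\r\n/g, "\n");
      const events = [];
      let boundary;
      while ((boundary = buffer.indexOf("\n\n")) !== -1) {
        const frame = buffer.slice(0, boundary);
        buffer = buffer.slice(boundary + 2);
        let eventName = "message"; // default event type in SSE
        const dataLines = [];
        for (const line of frame.split("\n")) {
          if (line.startsWith("data:")) {
            dataLines.push(line.slice(5).replace(/^ /, ""));
          } else if (line.startsWith("event:")) {
            eventName = line.slice(6).trim();
          }
        }
        if (dataLines.length > 0) {
          events.push({ event: eventName, data: dataLines.join("\n") });
        }
      }
      return events;
    },
    // Exposes the unparsed tail, e.g. for abort diagnostics.
    pending() {
      return buffer;
    },
  };
}
```

In the interceptor, each `Uint8Array` from the stream reader would first go through `new TextDecoder().decode(chunk, { stream: true })` with a single shared decoder, so multi-byte characters split across chunks are also preserved before the text reaches `feed()`.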

## Final Recommendation

Do not over-design beyond this.

The first goal is only to prove:

- Firefox can keep a controller page open
- the controller page can keep a WS open
- a real Claude tab can be observed without DOM parsing
- endpoints and credentials can be learned and sent to `baa-server`

Once that proof exists, the rest of the system can be evolved safely.