im_wower
·
2026-03-22
PLAN.md
# BAA Firefox MVP Plan

## Goal

Build the smallest Firefox extension MVP that proves this model works:

- one always-open controller page inside the extension
- one real top-level AI tab
- manual login only
- automatic discovery of API endpoints, auth headers, and streaming events
- a long-lived WebSocket from the controller page to `baa-server`
- no DOM parsing as the primary path

The first platform should be `claude.ai` only. After it works, generalize.

## Non-Goals

Do not solve these in MVP:

- multi-platform support
- multiple tabs per platform
- multiple accounts
- multi-window coordination
- iframe embedding
- full request execution from server into the AI tab
- automatic login
- deep DOM automation
- generic worker/websocket interception
- polished UI

## Core Decision

Do not keep the main WebSocket in `background.js`.

Instead:

- `controller.html` stays open in a normal browser tab
- `controller.js` owns the WebSocket and the runtime state
- `background.js` becomes thin bootstrap and helper logic only

Reason:

- Firefox MV3 background lifetime is the main risk: the background script runs as a non-persistent event page and can be unloaded when idle
- a visible extension page is simpler than trying to keep a background context alive
- the controller page can directly use WebExtension APIs needed for MVP

## MVP Architecture

### 1. Controller Page

One extension page acts as the control tower.

Responsibilities:

- open and keep track of the single AI tab
- connect to `baa-server` over WS
- receive browser-side events from content scripts
- persist minimal state to `storage.local`
- surface simple status: connected, tab exists, credentials seen, endpoints count

The controller page should be opened manually at first.

Later, background can auto-open it on startup if needed.

### 2. Real AI Tab

Use one real top-level tab:

- `https://claude.ai`

Requirements:

- user logs in manually
- no specific chat URL required
- the root app page is enough

The tab is the real browser surface that carries:

- cookies
- session
- same-origin rules
- anti-bot browser context

### 3. Content Script Bridge

Inject a lightweight content script into the AI tab.

Responsibilities:

- receive `CustomEvent`s from the page context
- forward them to the controller page or background

This is the bridge between:

- privileged extension code
- page MAIN world instrumentation
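
A minimal sketch of this bridge, assuming the MAIN world script dispatches a `CustomEvent` on `window` whose `detail` is a JSON string. The event name `baa:page-event` and the `{ source, payload }` envelope are hypothetical and not fixed by this plan.

```js
// Hypothetical event name shared by the page script and the content script.
const BRIDGE_EVENT = "baa:page-event";

// Forward bridge events from the page to the extension runtime.
// `target` is normally `window`; `runtime` is normally `browser.runtime`.
function installBridge(target, runtime, eventName = BRIDGE_EVENT) {
  const handler = (event) => {
    let detail = event.detail;
    // Assumption: the page sends a JSON string so no page object crosses the boundary.
    if (typeof detail === "string") {
      try {
        detail = JSON.parse(detail);
      } catch {
        return; // ignore malformed payloads
      }
    }
    if (!detail || typeof detail.type !== "string") return;
    const result = runtime.sendMessage({ source: "page", payload: detail });
    // sendMessage returns a promise in Firefox; avoid unhandled rejections
    // when no controller page is currently listening.
    if (result && typeof result.catch === "function") result.catch(() => {});
  };
  target.addEventListener(eventName, handler);
  return () => target.removeEventListener(eventName, handler);
}

// In content-script.js this would be called as:
//   installBridge(window, browser.runtime);
```

Passing `window` and `browser.runtime` in as arguments keeps the bridge testable outside the browser.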

### 4. MAIN World Interceptor

Inject a MAIN world script into the AI tab.

Responsibilities:

- patch `window.fetch`
- observe request URL, method, headers, body
- observe response status, headers, body
- observe SSE chunks if present
- emit normalized events through `CustomEvent`

This is the primary discovery layer for:

- API endpoints
- request/response shapes
- streaming behavior
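
The fetch patch could look roughly like the sketch below. It reports metadata only and returns the original response untouched, leaving body and SSE capture out. Every field of the emitted object other than `type` is an assumption, and so is the event name in the usage comment.

```js
// Build a drop-in replacement for fetch that reports metadata through `emit`.
function wrapFetch(originalFetch, emit) {
  return async function patchedFetch(input, init) {
    const url = typeof input === "string" ? input : (input && input.url) || String(input);
    const method = String(
      (init && init.method) || (typeof input === "object" && input && input.method) || "GET"
    ).toUpperCase();
    const startedAt = Date.now();
    try {
      const response = await originalFetch(input, init);
      emit({
        type: "network_event",
        url,
        method,
        status: response.status,
        contentType: response.headers.get("content-type"),
        durationMs: Date.now() - startedAt,
      });
      return response; // untouched, so the page behaves as before
    } catch (error) {
      emit({ type: "network_event", url, method, error: String((error && error.message) || error) });
      throw error; // never swallow the page's own failure
    }
  };
}

// In page-interceptor.js (hypothetical event name):
//   window.fetch = wrapFetch(window.fetch.bind(window), (data) =>
//     window.dispatchEvent(new CustomEvent("baa:page-event", { detail: JSON.stringify(data) })));
```

Rethrowing in the `catch` branch matters: the page must see exactly the failure it would have seen without the patch.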

### 5. Browser-Level Request Observer

Use `webRequest` in parallel.

Responsibilities:

- capture outgoing request headers at browser level
- recover auth-related headers not visible from page JS
- build a credential snapshot
- provide a second signal for endpoint discovery

This is the primary source for:

- cookies
- CSRF-style headers
- browser-visible request metadata
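
A sketch of the observer registration, assuming the non-blocking `webRequest.onSendHeaders` event is sufficient because the MVP only reads headers. The function name and callback shape are hypothetical, and whether the `Cookie` header appears in `requestHeaders` should be verified on the target Firefox version.

```js
const CLAUDE_URLS = ["https://claude.ai/*"];

// Register a read-only header observer.
// `webRequest` is normally `browser.webRequest`; it is passed in for testability.
function observeRequestHeaders(webRequest, onHeaders) {
  const listener = (details) => {
    // details.requestHeaders is an array of { name, value } objects.
    onHeaders({
      url: details.url,
      method: details.method,
      tabId: details.tabId,
      headers: details.requestHeaders || [],
    });
    // Nothing is returned: this listener observes and never modifies the request.
  };
  webRequest.onSendHeaders.addListener(listener, { urls: CLAUDE_URLS }, ["requestHeaders"]);
  return () => webRequest.onSendHeaders.removeListener(listener);
}

// In controller.js:
//   const stop = observeRequestHeaders(browser.webRequest, (req) => console.log(req.url));
```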

## Why This MVP Should Work

This design avoids the fragile path:

- no DOM scraping as the primary integration
- no button clicking or selector dependency
- no iframe

This design keeps the stable path:

- real browser tab
- real cookies
- real request flow
- extension-side network observation

The working proof for MVP is simple:

- the user logs into Claude manually
- the extension sees live endpoint traffic
- the extension captures auth material
- the extension sends these to `baa-server`

## Files To Create

The repo should start with these files:

- `manifest.json`
- `controller.html`
- `controller.js`
- `background.js`
- `content-script.js`
- `page-interceptor.js`
- `README.md`

Optional:

- `controller.css`

## File Responsibilities

### `manifest.json`

Firefox-targeted MV3 manifest.

Include only what MVP needs:

- `tabs`
- `storage`
- `webRequest`
- `cookies`
- `scripting`
- host permissions for `https://claude.ai/*`

Register:

- `background.js`
- `content-script.js`
- `page-interceptor.js`
- a toolbar `action` (the MV3 replacement for `browser_action`) or a direct tab page for `controller.html`
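
A possible starting manifest, as a sketch only. The `name`, `version`, `default_title`, and gecko `id` values are placeholders. Declaring `"world": "MAIN"` in `content_scripts` is an assumption that requires a recent Firefox release; if it is unavailable, the interceptor would have to be injected another way.

```json
{
  "manifest_version": 3,
  "name": "BAA Firefox MVP",
  "version": "0.1.0",
  "permissions": ["tabs", "storage", "webRequest", "cookies", "scripting"],
  "host_permissions": ["https://claude.ai/*"],
  "background": {
    "scripts": ["background.js"]
  },
  "action": {
    "default_title": "Open BAA controller"
  },
  "content_scripts": [
    {
      "matches": ["https://claude.ai/*"],
      "js": ["content-script.js"],
      "run_at": "document_start"
    },
    {
      "matches": ["https://claude.ai/*"],
      "js": ["page-interceptor.js"],
      "run_at": "document_start",
      "world": "MAIN"
    }
  ],
  "browser_specific_settings": {
    "gecko": {
      "id": "baa-mvp@example.invalid"
    }
  }
}
```

Both scripts run at `document_start` so the fetch patch is in place before the Claude app issues its first request.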

### `controller.html`

Minimal UI only.

Must show:

- WS connected / disconnected
- AI tab exists / missing
- credentials seen / missing
- number of discovered endpoints
- recent log lines

One button is enough:

- `Open Claude Tab`

### `controller.js`

Main runtime owner.

Responsibilities:

- connect WS to `baa-server`
- create or recover the Claude tab
- maintain `platformTabId`
- receive forwarded events
- normalize and send them to server
- persist last known state

State shape can be minimal:

```js
{
  platform: "claude",
  tabId: 123,
  wsConnected: true,
  lastCredentialAt: 0,
  endpoints: {},
  lastHeaders: {},
  lastEvents: []
}
```
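
The controller-owned WebSocket could be wrapped as sketched below. Reconnect with exponential backoff and an outbound queue are assumptions about what "long-lived" should mean here, and the function name, option names, and delays are hypothetical.

```js
// Keep one WebSocket alive, reconnecting with exponential backoff.
// `WebSocketImpl` and `setTimer` are injectable so the logic can run outside a browser.
function createControllerSocket(url, options = {}) {
  const {
    WebSocketImpl = globalThis.WebSocket,
    onStatus = () => {},
    baseDelayMs = 1000,
    maxDelayMs = 30000,
    setTimer = setTimeout,
  } = options;
  let socket = null;
  let attempts = 0;
  let closedByUser = false;
  const queue = []; // messages waiting for an open connection

  function connect() {
    socket = new WebSocketImpl(url);
    socket.onopen = () => {
      attempts = 0;
      onStatus({ wsConnected: true });
      while (queue.length > 0) socket.send(queue.shift());
    };
    socket.onclose = () => {
      onStatus({ wsConnected: false });
      if (closedByUser) return;
      const delay = Math.min(maxDelayMs, baseDelayMs * 2 ** attempts);
      attempts += 1;
      setTimer(connect, delay);
    };
  }

  connect();
  return {
    send(message) {
      const text = JSON.stringify(message);
      if (socket && socket.readyState === 1) socket.send(text); // 1 === OPEN
      else queue.push(text);
    },
    close() {
      closedByUser = true;
      if (socket) socket.close();
    },
  };
}
```

The `onStatus` callback is where the `wsConnected` field of the state shape above would be updated.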

### `background.js`

Thin only.

Responsibilities:

- open/focus `controller.html` when extension icon is clicked
- optionally relay messages if direct page-to-tab wiring is awkward
- nothing long-lived beyond bootstrap

No main WS should live here in MVP.

### `content-script.js`

Bridge only.

Responsibilities:

- listen for `CustomEvent`s from page context
- forward to extension runtime

Keep it small.

### `page-interceptor.js`

MAIN world network observer.

Responsibilities:

- wrap `fetch`
- capture requests
- capture error responses
- capture streaming events where possible

For MVP, patching `fetch` is enough.

If Claude moves important traffic elsewhere later, expand after proof.
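
For streaming capture, one option is to split the response body with `ReadableStream.tee()` so the page and the observer each read their own branch. This is an untested sketch, not the current implementation; the callback names and the `outcome` values are hypothetical.

```js
// Observe a streaming response without consuming the page's copy.
// Returns a new Response for the page; chunks go to `onChunk`, the end state to `onEnd`.
function observeStream(response, onChunk, onEnd) {
  if (!response.body) return response; // nothing to stream
  const [forPage, forObserver] = response.body.tee();
  const reader = forObserver.getReader();
  const decoder = new TextDecoder();
  let chunks = 0;

  (async () => {
    try {
      for (;;) {
        const { done, value } = await reader.read();
        if (done) break;
        chunks += 1;
        onChunk(decoder.decode(value, { stream: true }));
      }
      onEnd({ outcome: "complete", chunks });
    } catch (error) {
      const aborted = Boolean(error) && error.name === "AbortError";
      onEnd({ outcome: aborted ? "aborted" : "error", chunks, reason: String(error) });
    }
  })();

  // Caveat: a rebuilt Response does not carry over `url`, `redirected`, or `type`,
  // and constructing one with a body throws for null-body statuses such as 204.
  return new Response(forPage, {
    status: response.status,
    statusText: response.statusText,
    headers: response.headers,
  });
}
```

Reporting `complete`, `aborted`, and `error` separately gives the controller a way to tell a clean end of stream from a navigation or refresh.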

## Message Flow

### Browser Startup

1. User opens `controller.html`
2. `controller.js` connects WS to `baa-server`
3. `controller.js` checks for existing Claude tab
4. If missing, it opens `https://claude.ai`
5. Content script and MAIN world interceptor are active on that tab
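
Steps 3 and 4 can be sketched as a single find-or-create helper. The function name and the optional `knownTabId` argument (a tab ID restored from `storage.local`) are hypothetical.

```js
const CLAUDE_ORIGIN = "https://claude.ai/";

// Return the ID of a usable Claude tab, opening one only if none exists.
// `tabsApi` is normally `browser.tabs`; it is passed in for testability.
async function ensureClaudeTab(tabsApi, knownTabId) {
  // 1. Prefer the tab we already tracked, if it is still on claude.ai.
  if (typeof knownTabId === "number") {
    try {
      const tab = await tabsApi.get(knownTabId);
      if (tab && tab.url && tab.url.startsWith(CLAUDE_ORIGIN)) return tab.id;
    } catch {
      // tabs.get rejects when the tab no longer exists; fall through.
    }
  }
  // 2. Otherwise adopt any existing Claude tab.
  const matches = await tabsApi.query({ url: "https://claude.ai/*" });
  if (matches.length > 0) return matches[0].id;
  // 3. Otherwise open a new one.
  const created = await tabsApi.create({ url: CLAUDE_ORIGIN, active: true });
  return created.id;
}

// In controller.js:
//   const tabId = await ensureClaudeTab(browser.tabs, state.tabId);
```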

### Manual Login

1. User logs into Claude in the real tab
2. Claude app bootstraps and starts making network requests
3. `webRequest` captures browser-visible auth headers and cookies
4. MAIN world interceptor captures request and response data
5. Controller receives both and sends normalized events to `baa-server`

### Endpoint Discovery

1. MAIN world interceptor sees a request
2. URL path is normalized
3. Method + normalized path are added to endpoint registry
4. Controller sends discovered endpoints to `baa-server`
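
The normalization and registry steps might look like this. The sketch assumes that dynamic path segments are UUIDs or plain integers, which has not been verified against Claude's real URLs, and the registry entry fields are hypothetical.

```js
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
const NUMERIC_RE = /^\d+$/;

// Replace ID-like path segments with "{id}" and drop the query string.
function normalizePath(rawUrl, base = "https://claude.ai") {
  const { pathname } = new URL(rawUrl, base);
  return pathname
    .split("/")
    .map((segment) => (UUID_RE.test(segment) || NUMERIC_RE.test(segment) ? "{id}" : segment))
    .join("/");
}

// Record one observation. Returns an `endpoint_discovered` message the first
// time a method + path pair is seen, and null afterwards.
function recordEndpoint(registry, method, rawUrl, now = Date.now()) {
  const upperMethod = method.toUpperCase();
  const path = normalizePath(rawUrl);
  const key = `${upperMethod} ${path}`;
  const isNew = !(key in registry);
  const entry = registry[key] || { method: upperMethod, path, firstSeen: now, count: 0 };
  entry.count += 1;
  entry.lastSeen = now;
  registry[key] = entry;
  return isNew
    ? { type: "endpoint_discovered", platform: "claude", method: upperMethod, path }
    : null;
}
```

Returning a message only on first sight keeps `endpoint_discovered` traffic to `baa-server` proportional to the number of distinct endpoints, not the number of requests.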

### Credential Snapshot

1. `webRequest` sees request headers
2. Controller extracts a reduced auth snapshot
3. Snapshot is timestamped and stored
4. Snapshot is sent to `baa-server`
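
Step 2 can be sketched as an allowlist filter over the `webRequest` header array. The allowlist below simply reuses the three header names from the `credential_snapshot` example in this plan; treat it as an assumption to revise once real traffic has been observed.

```js
// Header names worth keeping, lowercased. Assumption: taken from the
// credential_snapshot example, not from verified Claude traffic.
const AUTH_HEADER_NAMES = new Set(["cookie", "x-csrf-token", "anthropic-client-sha"]);

// Reduce a webRequest header array ([{ name, value }, ...]) to a snapshot message.
// Returns null when no auth-related header is present.
function buildCredentialSnapshot(requestHeaders, now = Date.now()) {
  const headers = {};
  for (const { name, value } of requestHeaders) {
    const key = name.toLowerCase(); // header names are case-insensitive
    if (AUTH_HEADER_NAMES.has(key) && typeof value === "string") {
      headers[key] = value;
    }
  }
  if (Object.keys(headers).length === 0) return null;
  return { type: "credential_snapshot", platform: "claude", headers, ts: now };
}
```

An allowlist, as opposed to forwarding every header, keeps the snapshot small and avoids sending unrelated request metadata to the server.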

## Data Sent To Server

MVP should send only these message types:

- `hello`
- `endpoint_discovered`
- `credential_snapshot`
- `network_event`
- `sse_event`
- `status`

Example normalized messages:

```json
{
  "type": "hello",
  "clientId": "ff-xxxxxx",
  "nodeType": "browser",
  "nodeCategory": "proxy",
  "nodePlatform": "firefox"
}
```

```json
{
  "type": "endpoint_discovered",
  "platform": "claude",
  "method": "POST",
  "path": "/api/organizations/{id}/messages"
}
```

```json
{
  "type": "credential_snapshot",
  "platform": "claude",
  "headers": {
    "cookie": "...",
    "x-csrf-token": "...",
    "anthropic-client-sha": "..."
  },
  "ts": 0
}
```

## Minimal Acceptance Criteria

MVP is done when all of these are true:

- opening `controller.html` establishes a WS connection to `baa-server`
- the controller can open or recover exactly one Claude tab
- manual login in that tab leads to captured credentials
- at least one live Claude API endpoint is discovered automatically
- request/response metadata is visible in server logs
- SSE chunks are visible if Claude uses streaming on the observed path
- closing the Claude tab and reopening it still restores the flow

## Build Order

Implement in this order:

1. `manifest.json`
2. `controller.html` + `controller.js`
3. thin `background.js`
4. `content-script.js`
5. `page-interceptor.js`
6. `webRequest` credential capture
7. WS message normalization to server
8. minimal status UI

## Scope Cuts If Time Is Tight

If the first pass needs to be even smaller, cut these first:

- pretty UI
- SSE chunk parsing details
- background auto-open behavior
- storage persistence beyond `tabId` and last headers
- endpoint normalization beyond very basic ID replacement

Do not cut these:

- controller-owned WS
- real top-level Claude tab
- MAIN world fetch interception
- `webRequest` credential capture

## Deferred Work After MVP

Only after the MVP works:

- add active proxy execution from server into the live AI tab
- support ChatGPT and Gemini
- support multiple windows
- support multi-account contexts
- add stronger recovery logic
- add XHR and worker interception
- make the controller page optional by moving some logic back into a more robust runtime model

## TODO: SSE Capture

Observed in the current Firefox MVP:

- request / credential / endpoint capture works
- `POST /api/organizations/{id}/chat_conversations/{id}/completion` is discovered
- the current SSE bridge can still report `The operation was aborted` with `0` chunks
- this happened in the anonymous `https://claude.ai/new?incognito` flow after a real prompt/response round trip

Next work should focus on these items:

- reproduce the failure deterministically on both anonymous and logged-in organization chats
- verify whether `response.clone().body.getReader()` is racing with Claude's own stream consumption in Firefox
- test replacing the current clone-reader approach with a `ReadableStream.tee()` based interception path
- keep partial chunk state across boundaries so incomplete `data:` frames are not dropped
- include stable conversation identifiers in `sse_event` payloads when the completion URL contains them
- distinguish clean stream completion from abort / navigation / page refresh in emitted events
- add controller-side logs for first chunk, last chunk, abort reason, and total chunk count
- only mark SSE capture as done after a verified chunked response is observed in `baa-server` logs
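
For the partial-chunk item, a small stateful parser can buffer text until a full frame (terminated by a blank line) has arrived. This sketch handles `event:` and `data:` fields with `\n` or `\r\n` line endings only, so it is a simplification of the full SSE format; `createSseParser` is a hypothetical name.

```js
// Incremental SSE parser: feed decoded text chunks in, get complete events out.
function createSseParser(onEvent) {
  let buffer = ""; // holds any incomplete frame between pushes

  return {
    push(text) {
      // Normalize on the whole buffer so a "\r\n" split across chunks is still caught.
      buffer = (buffer + text).replace(/\r\n/g, "\n");
      let boundary;
      while ((boundary = buffer.indexOf("\n\n")) !== -1) {
        const frame = buffer.slice(0, boundary);
        buffer = buffer.slice(boundary + 2);
        let eventName = "message"; // default event type
        const dataLines = [];
        for (const line of frame.split("\n")) {
          if (line.startsWith(":")) continue; // comment or keep-alive
          const colon = line.indexOf(":");
          const field = colon === -1 ? line : line.slice(0, colon);
          let value = colon === -1 ? "" : line.slice(colon + 1);
          if (value.startsWith(" ")) value = value.slice(1); // one optional space
          if (field === "event") eventName = value;
          else if (field === "data") dataLines.push(value);
        }
        if (dataLines.length > 0) onEvent({ event: eventName, data: dataLines.join("\n") });
      }
    },
    // Number of buffered characters that do not yet form a complete frame.
    pending() {
      return buffer.length;
    },
  };
}
```

A non-zero `pending()` at stream end is a useful signal for the controller logs: it means the stream stopped mid-frame.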

## Final Recommendation

Do not over-design beyond this.

The first goal is only to prove:

- Firefox can keep a controller page open
- the controller page can keep a WS open
- a real Claude tab can be observed without DOM parsing
- endpoints and credentials can be learned and sent to `baa-server`

Once that proof exists, the rest of the system can be evolved safely.