[Feature]: unify network interception across explore, diagnostic, and operate #810
Description
End-to-end tested the full adapter lifecycle (explore → generate → plugin → operate → diagnostic → repair) on v1.6.7. The individual pieces mostly work, but they don't share browser instrumentation, which limits the AI agent story.
All claims below have been verified by both static code analysis and runtime reproduction on v1.6.7.
What was tested
| Step | Result | Notes |
|---|---|---|
| explore | Runs, but 0 API endpoints on HN and Lobsters | Compensation strategy (iframe re-fetch) exists but rarely triggers |
| generate | Fails (0 candidates) | Depends on explore finding APIs; confirmed: synthesize hackernews → 0 |
| plugin create | Scaffold OK | YAML template uses nonexistent extract pipeline step (separate bug) |
| plugin install (TS) | Compile OK, runtime OK | linkHostOpencli() in plugin.ts already creates node_modules/@jackwener/opencli symlink |
| operate open/state/eval | All work | |
| operate network | 0 requests after open | Interceptor injected after goto + wait, misses initial load. Confirmed at runtime. |
| operate network (post-eval) | Works | Manual fetch after open is captured correctly (visible with --all) |
| operate init + verify | Works | Template uses correct package exports |
| diagnostic (TS) | Works | error.code and source both correct |
| diagnostic (YAML) | source missing | Covered by #808 |
| diagnostic (browser page state) | Minimal info | networkRequests 0, consoleErrors [] (see note below) |
| repair loop (manual) | Each step works individually | diagnostic → operate explore → patch → verify all functional |
Note on consoleErrors: []: in base-page.ts, consoleMessages() returns an empty array unconditionally, with no subclass override. So diagnostic's consoleErrors: [] does not mean "no errors on page"; it means console errors are not collected at all. This further weakens diagnostic info quality.
Core finding: fragmented network interception paths
The codebase has at least four independent network capture mechanisms (three relevant to the explore/diagnostic/operate story, plus one in record):
| System | Global var | Used by | Captures |
|---|---|---|---|
| NETWORK_INTERCEPTOR_JS (cli.ts:310) | __opencli_net | operate | fetch/XHR with method, status, body. Injected after page load, so misses initial requests. |
| interceptor.ts → installInterceptor() | __opencli_xhr | pipeline intercept step | fetch/XHR with body. Injected before trigger, so no timing gap. |
| generateTapInterceptorJs | temporary | pipeline tap step | One-shot capture, restores original functions after. |
| record.ts → generateRecordInterceptorJs | __opencli_record | record | fetch/XHR with method, status, headers, request/response body. Idempotent re-injection with restore. |
diagnostic and explore use none of these. They both go through page.networkRequests() → performance.getEntriesByType('resource'), which returns URL and timing metadata but no method, status, or response body.
The key gap: when a pipeline adapter fails, the intercept step may have already captured rich network data into window.__opencli_xhr, but diagnostic.ts only reads page.networkRequests() (the performance API path), not page.getInterceptedRequests().
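To make the gap concrete, here is a minimal sketch of how diagnostic could combine both sources. It is not the actual diagnostic.ts code: the PageLike interface, the entry shapes, and collectNetworkEvidence are hypothetical, and the calls are synchronous only to keep the example short (the real networkRequests() and getInterceptedRequests() signatures may differ and are probably async).

```typescript
// Hypothetical shapes for illustration; the real types in the codebase may differ.
interface TimingEntry { url: string; durationMs: number }
interface InterceptedEntry { url: string; method: string; status: number; body: string }

// Minimal stand-in for the page object (assumed, synchronous for brevity).
interface PageLike {
  networkRequests(): TimingEntry[];
  getInterceptedRequests?(): InterceptedEntry[];
}

interface NetworkEvidence {
  url: string;
  durationMs?: number;
  method?: string;
  status?: number;
  body?: string;
}

// Merge timing-only entries with intercepted entries, keyed by URL.
// Later intercepted entries for the same URL overwrite earlier ones.
function collectNetworkEvidence(page: PageLike): NetworkEvidence[] {
  const byUrl = new Map<string, NetworkEvidence>();
  for (const t of page.networkRequests()) {
    byUrl.set(t.url, { url: t.url, durationMs: t.durationMs });
  }
  // Fall back to an empty list when no interceptor data is available.
  const intercepted = page.getInterceptedRequests ? page.getInterceptedRequests() : [];
  for (const r of intercepted) {
    const existing = byUrl.get(r.url) ?? { url: r.url };
    byUrl.set(r.url, { ...existing, method: r.method, status: r.status, body: r.body });
  }
  return Array.from(byUrl.values());
}
```

With this shape, a failing adapter's diagnostic would still list timing-only entries when no interceptor ran, and would gain method, status, and body whenever the intercept step had already captured them.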
Suggested improvements (by impact/cost ratio)
High impact, low cost:
- diagnostic: also read page.getInterceptedRequests() when available. Pipeline's intercept step may have already captured the data that diagnostic needs. One additional call in collectPageState.
- diagnostic: implement real consoleMessages() collection (currently base class returns [] unconditionally), or at minimum document that console errors are not captured yet.
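For the console item, collection could look roughly like the following. This is a sketch under assumptions: ConsoleLike, ConsoleMessage, and installConsoleCollector are hypothetical names, not existing opencli APIs, and in practice the patch would have to run inside the page context (or be replaced by a browser-level console event listener).

```typescript
interface ConsoleMessage { level: 'error' | 'warn'; text: string }

// Minimal console surface, so the sketch does not depend on the global console.
interface ConsoleLike {
  error(...args: unknown[]): void;
  warn(...args: unknown[]): void;
}

// Patches error/warn on `target` and records each call into a capped buffer.
// Returns the buffer plus a restore function that undoes the patch.
function installConsoleCollector(target: ConsoleLike, maxEntries = 100) {
  const messages: ConsoleMessage[] = [];
  const original = { error: target.error, warn: target.warn };
  const levels: Array<'error' | 'warn'> = ['error', 'warn'];
  for (const level of levels) {
    target[level] = (...args: unknown[]) => {
      // Cap the buffer so a noisy page cannot grow it without bound.
      if (messages.length < maxEntries) {
        messages.push({ level, text: args.map(String).join(' ') });
      }
      // Always forward to the original so page behavior is unchanged.
      original[level].apply(target, args);
    };
  }
  const restore = () => {
    target.error = original.error;
    target.warn = original.warn;
  };
  return { messages, restore };
}
```

The cap is a design choice for this sketch only; any real limit would need to fit the size caps diagnostic already applies.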
Medium impact, medium cost:
- operate open: inject NETWORK_INTERCEPTOR_JS before or during goto() instead of after, so initial page-load requests are captured. Or use CDP-level Fetch.requestPaused / Network.requestWillBeSent, which don't have timing issues.
- explore: use page.installInterceptor() (already available on IPage) instead of relying solely on performance.getEntriesByType + iframe re-fetch. This would give explore the same quality network data that pipeline intercept steps get.
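One way to get the ordering right for operate open is to register the interceptor as a new-document script before navigating. The sketch below only builds the command sequence; CdpCommand and buildOpenCommands are hypothetical, and whether opencli's browser layer can send raw CDP commands in this form is an assumption. The CDP method names themselves (Network.enable, Page.addScriptToEvaluateOnNewDocument, Page.navigate) are standard.

```typescript
interface CdpCommand { method: string; params: Record<string, unknown> }

// Builds the command sequence for "open" so the interceptor is registered
// before navigation starts. Sending the commands is left to the CDP client.
function buildOpenCommands(url: string, interceptorJs: string): CdpCommand[] {
  return [
    // Optional: lets the client also receive Network.requestWillBeSent events.
    { method: 'Network.enable', params: {} },
    // Registers a script that runs in every new document before page scripts.
    { method: 'Page.addScriptToEvaluateOnNewDocument', params: { source: interceptorJs } },
    // Navigation comes last, so initial page-load requests hit the patched fetch/XHR.
    { method: 'Page.navigate', params: { url } },
  ];
}
```

The point is the ordering: as long as the script registration precedes the navigate command, the timing gap seen in operate network after open should not occur.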
Separate bug (not part of network unification):
- plugin create: fix YAML template to use a valid pipeline step (replace extract with evaluate or map). See plugin-scaffold.ts:91-98.
Context
These all feed into the vision where each user's AI agent can autonomously explore a website, generate an adapter, and self-repair when it breaks. The operate path (open → state → eval → init → verify) already works end-to-end today. The automated path (explore → generate) and the repair path (diagnostic → fix) are blocked primarily by network data quality.
Note: #806 already added safety boundaries to diagnostic output (redaction of secrets, size caps, collection timeout, source path resolution). The remaining gap is purely about data richness — getting method/status/body into explore and diagnostic, not safety.