Commit 9a3081d

Merge branch 'pr/Eric-Guo/245'
2 parents 1071270 + bd6448e commit 9a3081d

10 files changed, +778 -53 lines changed


docs/external-dependencies.md

Lines changed: 10 additions & 5 deletions
@@ -19,7 +19,7 @@
 | 11 | BigQuery Metrics | `api.anthropic.com/api/claude_code/metrics` | HTTPS | Enabled by default |
 | 12 | MCP Proxy | `mcp-proxy.anthropic.com` | HTTPS+WS | When MCP tools are used |
 | 13 | MCP Registry | `api.anthropic.com/mcp-registry` | HTTPS | When querying MCP servers |
-| 14 | Bing Search | `www.bing.com` | HTTPS | WebSearch tool |
+| 14 | Web Search Pages | `www.bing.com`, `search.brave.com` | HTTPS | WebSearch tool; switchable via `WEB_SEARCH_ADAPTER=bing|brave` |
 | 15 | Google Cloud Storage (updates) | `storage.googleapis.com` | HTTPS | Version check |
 | 16 | GitHub Raw (Changelog/Stats) | `raw.githubusercontent.com` | HTTPS | Update notices |
 | 17 | Claude in Chrome Bridge | `bridge.claudeusercontent.com` | WSS | Chrome integration |
@@ -121,12 +121,16 @@ Anthropic-hosted MCP server proxy.
 - **Endpoint**: `https://api.anthropic.com/mcp-registry/v0/servers?version=latest&visibility=commercial`
 - **File**: `src/services/mcp/officialRegistry.ts`

-### 14. Bing Search
+### 14. Web Search Pages

-The default adapter for the WebSearch tool; it scrapes Bing search results.
+The WebSearch tool can scrape Bing search result pages directly, or obtain search context through Brave's LLM Context API;
+set `WEB_SEARCH_ADAPTER=bing|brave` to switch backends explicitly.

-- **Endpoint**: `https://www.bing.com/search?q={query}&setmkt=en-US`
-- **File**: `src/tools/WebSearchTool/adapters/bingAdapter.ts`
+- **Bing endpoint**: `https://www.bing.com/search?q={query}&setmkt=en-US`
+- **Brave endpoint**: `https://api.search.brave.com/res/v1/llm/context?q={query}`
+- **Files**:
+  - `src/tools/WebSearchTool/adapters/bingAdapter.ts`
+  - `src/tools/WebSearchTool/adapters/braveAdapter.ts`

 There is also a Domain Blocklist lookup:
 - **Endpoint**: `https://api.anthropic.com/api/web/domain_info?domain={domain}`
@@ -201,6 +205,7 @@ The default adapter for the WebSearch tool; it scrapes Bing search results.
 | `{region}-aiplatform.googleapis.com` | Google Vertex AI | HTTPS |
 | `{resource}.services.ai.azure.com` | Azure Foundry | HTTPS |
 | `www.bing.com` | Bing search | HTTPS |
+| `search.brave.com` | Brave search | HTTPS |
 | `storage.googleapis.com` | Auto-update | HTTPS |
 | `raw.githubusercontent.com` | Changelog / plugin stats | HTTPS |
 | `bridge.claudeusercontent.com` | Chrome Bridge | WSS |
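
As a rough illustration of the two endpoints documented in this file, the sketch below fills the `{query}` placeholder for each backend. The helper name `buildSearchUrl` and the `PageBackend` type are hypothetical and do not appear in the commit; only the two URL templates come from the diff above.

```typescript
// Hypothetical helper, not part of the commit: fills the {query} placeholder
// of the Bing and Brave endpoint templates listed in external-dependencies.md.
type PageBackend = 'bing' | 'brave'

function buildSearchUrl(backend: PageBackend, query: string): string {
  // encodeURIComponent escapes spaces, '&', '=' and other reserved characters
  // so the query cannot break out of the `q` parameter.
  const q = encodeURIComponent(query)
  return backend === 'bing'
    ? `https://www.bing.com/search?q=${q}&setmkt=en-US`
    : `https://api.search.brave.com/res/v1/llm/context?q=${q}`
}

console.log(buildSearchUrl('bing', 'bun test mock'))
console.log(buildSearchUrl('brave', 'bun test mock'))
```

How the real adapters assemble their requests (headers, extra parameters, authentication) is not shown in this part of the diff, so the sketch covers URL construction only.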

docs/features/web-search-tool.md

Lines changed: 23 additions & 21 deletions
@@ -1,11 +1,11 @@
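 # WEB_SEARCH_TOOL: Web Search Tool

-> Implementation status: adapter architecture complete; the Bing adapter is the current default backend
+> Implementation status: adapter architecture complete; three backends are supported (API / Bing / Brave)
 > Reference count: core tool, not gated by a feature flag (always enabled)

 ## 1. Overview

-WebSearchTool lets the model search the internet for up-to-date information. The original implementation supported only Anthropic API server-side search (the `web_search_20250305` server tool), which is unavailable behind third-party proxy endpoints. It has been refactored into an adapter architecture, with Bing search page parsing added as a fallback so that search works with any API endpoint.
+WebSearchTool lets the model search the internet for up-to-date information. The original implementation supported only Anthropic API server-side search (the `web_search_20250305` server tool), which is unavailable behind third-party proxy endpoints. It has been refactored into an adapter architecture that supports API server-side search as well as two HTML-parsing backends, Bing and Brave, so that search works with any API endpoint.

 ## 2. Implementation Architecture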

@@ -21,9 +21,13 @@ WebSearchTool.call()
 │   └── Uses the web_search_20250305 server tool
 │       Calls the API a second time via queryModelWithStreaming

-└── BingSearchAdapter - Bing HTML scraping + regex extraction (current default)
-    └── Fetches the Bing search page HTML directly
-        Extracts title/URL/snippet from b_algo blocks via regex
+├── BingSearchAdapter - Bing HTML scraping + regex extraction
+│   └── Fetches the Bing search page HTML directly
+│       Extracts title/URL/snippet from b_algo blocks via regex
+
+└── BraveSearchAdapter - Brave LLM Context API
+    └── Calls the Brave HTTPS GET endpoint
+        Maps the grounding payload to title/URL/snippet
 ```

 ### 2.2 Module Structure
@@ -37,8 +41,9 @@ WebSearchTool.call()
 | Adapter factory | `src/tools/WebSearchTool/adapters/index.ts` | `createAdapter()` factory function; selects the backend |
 | API adapter | `src/tools/WebSearchTool/adapters/apiAdapter.ts` | Wraps the original `queryModelWithStreaming` logic; uses the server tool |
 | Bing adapter | `src/tools/WebSearchTool/adapters/bingAdapter.ts` | Bing HTML scraping + regex parsing |
-| Unit tests | `src/tools/WebSearchTool/__tests__/bingAdapter.test.ts` | 32 test cases |
-| Integration tests | `src/tools/WebSearchTool/__tests__/bingAdapter.integration.ts` | Verification against real network requests |
+| Brave adapter | `src/tools/WebSearchTool/adapters/braveAdapter.ts` | Brave LLM Context API adaptation and result mapping |
+| Unit tests | `src/tools/WebSearchTool/__tests__/bingAdapter.test.ts`, `src/tools/WebSearchTool/__tests__/braveAdapter*.test.ts`, `src/tools/WebSearchTool/__tests__/adapterFactory.test.ts` | Tests for Bing / Brave parsing and factory logic |
+| Integration tests | `src/tools/WebSearchTool/__tests__/bingAdapter.integration.ts`, `src/tools/WebSearchTool/__tests__/braveAdapter.integration.ts` | Verification against real network requests |

 ### 2.3 Data Flow

@@ -49,20 +54,18 @@ WebSearchTool.call()
 validateInput() - checks that query is non-empty and that allowed/blocked domains are not both set


-createAdapter() → BingSearchAdapter (currently hard-coded)
+createAdapter() → ApiSearchAdapter | BingSearchAdapter | BraveSearchAdapter


 adapter.search(query, { allowedDomains, blockedDomains, signal, onProgress })

 ├── onProgress({ type: 'query_update', query })

-├── axios.get(bing.com/search?q=...&setmkt=en-US)
-│   └── 13 Edge browser request headers
+├── axios.get(search-engine-url)
+│   └── API authentication request headers

-├── extractBingResults(html) - regex extraction of <li class="b_algo"> blocks
-│   ├── resolveBingUrl() - decodes base64 redirect URLs
-│   ├── extractSnippet() - three-level fallback snippet extraction
-│   └── decodeHtmlEntities() - he.decode
+├── extractResults(payload) - extracts results according to the backend
+│   └── grounding → SearchResult[] mapping

 ├── Client-side domain filtering (allowedDomains / blockedDomains)

@@ -117,19 +120,18 @@ Redirect URL format returned by Bing: `bing.com/ck/a?...&u=a1aHR0cHM6Ly9...`

 ## 4. Adapter Selection Logic

-`createAdapter()` currently hard-codes `BingSearchAdapter` as its return value; the original logic is kept in comments
+`createAdapter()` selects the backend in the following priority order and caches the adapter instance by the selected backend key

 ```typescript
 export function createAdapter(): WebSearchAdapter {
-  return new BingSearchAdapter()
-  // Selection logic kept in comments:
-  // 1. WEB_SEARCH_ADAPTER environment variable forces api|bing
-  // 2. isFirstPartyAnthropicBaseUrl() → API adapter
-  // 3. Third-party endpoint → Bing adapter
+  // 1. WEB_SEARCH_ADAPTER=api|bing|brave set explicitly
+  // 2. Official Anthropic API base URL → ApiSearchAdapter
+  // 3. Third-party proxy / non-official endpoint → BingSearchAdapter
 }
 ```

-To restore automatic selection, uncomment the code in `index.ts`.
+When `WEB_SEARCH_ADAPTER=brave` is set explicitly, the Brave LLM Context API backend is used instead, which requires
+`BRAVE_SEARCH_API_KEY` or `BRAVE_API_KEY`

 ## 5. Interface Definitions

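
The three-step priority in the hunk above can be sketched as a pure function. This is a minimal sketch under stated assumptions, not the code in `index.ts`: the names `resolveBackendKey` and `resolveBraveApiKey` are hypothetical, unrecognized `WEB_SEARCH_ADAPTER` values are assumed to fall through to automatic selection, and `BRAVE_SEARCH_API_KEY` is assumed to take precedence over `BRAVE_API_KEY`. The diff does not confirm either assumption.

```typescript
type BackendKey = 'api' | 'bing' | 'brave'
type Env = Record<string, string | undefined>

// Hypothetical: mirrors the documented priority order only.
function resolveBackendKey(env: Env, isFirstPartyBaseUrl: boolean): BackendKey {
  const forced = env.WEB_SEARCH_ADAPTER
  // 1. An explicit WEB_SEARCH_ADAPTER=api|bing|brave wins.
  if (forced === 'api' || forced === 'bing' || forced === 'brave') {
    return forced
  }
  // 2. Official Anthropic base URL → API adapter.
  // 3. Third-party proxy / non-official endpoint → Bing adapter.
  return isFirstPartyBaseUrl ? 'api' : 'bing'
}

// Hypothetical: the docs only say one of the two variables is required.
// The lookup order and the error message here are assumptions.
function resolveBraveApiKey(env: Env): string {
  const key = env.BRAVE_SEARCH_API_KEY || env.BRAVE_API_KEY
  if (!key) {
    throw new Error('Brave backend requires BRAVE_SEARCH_API_KEY or BRAVE_API_KEY')
  }
  return key
}

console.log(resolveBackendKey({ WEB_SEARCH_ADAPTER: 'brave' }, true))
console.log(resolveBackendKey({}, false))
```

Taking the environment and the first-party check as parameters keeps the decision testable without mutating `process.env`. The commit's own tests take the other route and mock `isFirstPartyAnthropicBaseUrl` instead.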

docs/tools/search-and-navigation.mdx

Lines changed: 6 additions & 4 deletions
@@ -146,14 +146,15 @@ The AI's information gathering is not limited to local code:

 ### How WebSearch Is Implemented

-WebSearch supports two search backends through the adapter pattern; the factory function `createAdapter()` in `src/tools/WebSearchTool/adapters/` selects between them:
+WebSearch supports three search backends through the adapter pattern; the factory function `createAdapter()` in `src/tools/WebSearchTool/adapters/` selects between them:

 ```
 Adapter architecture:
 WebSearchTool.call()
   → createAdapter() selects the backend
     ├─ ApiSearchAdapter - Anthropic API server-side search (requires an official API key)
-    └─ BingSearchAdapter - scrapes and parses Bing search pages directly (no API key required)
+    ├─ BingSearchAdapter - scrapes and parses Bing search pages directly (no API key required)
+    └─ BraveSearchAdapter - calls the Brave LLM Context API and parses the response (requires a Brave API key)
   → adapter.search(query, options)
   → converts to the unified SearchResult[] format and returns it
 ```
@@ -166,8 +167,9 @@ WebSearch supports two search backends through the adapter pattern, selected by `src/tools/WebSear
 |--------|------|--------|
 | 1 | Environment variable `WEB_SEARCH_ADAPTER=api` | `ApiSearchAdapter` |
 | 2 | Environment variable `WEB_SEARCH_ADAPTER=bing` | `BingSearchAdapter` |
-| 3 | API base URL points to official Anthropic | `ApiSearchAdapter` |
-| 4 | Third-party proxy / non-official endpoint | `BingSearchAdapter` |
+| 3 | Environment variable `WEB_SEARCH_ADAPTER=brave` | `BraveSearchAdapter` |
+| 4 | API base URL points to official Anthropic | `ApiSearchAdapter` |
+| 5 | Third-party proxy / non-official endpoint | `BingSearchAdapter` |

 Adapters are stateless and are cached and reused within the same session.

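
The "cached and reused" behaviour can be pictured as a single-slot cache keyed by the selected backend, which matches what the factory tests in this commit assert: the same instance comes back for an unchanged backend, and a new one when the backend changes. The sketch below is an assumption about the shape of that cache, not the actual `createAdapter()` code. `createCachedFactory` is a hypothetical name.

```typescript
// Hypothetical single-slot cache: remembers the last adapter and the key
// it was built for, and rebuilds only when the key changes.
function createCachedFactory<K extends string, A>(build: (key: K) => A): (key: K) => A {
  let cached: { key: K; adapter: A } | undefined
  return (key: K): A => {
    if (cached !== undefined && cached.key === key) {
      return cached.adapter
    }
    const adapter = build(key)
    cached = { key, adapter }
    return adapter
  }
}

// Stand-in adapter objects; the real adapters are classes in adapters/*.ts.
const getAdapter = createCachedFactory((key: 'api' | 'bing' | 'brave') => ({ backend: key }))

const first = getAdapter('brave')
const second = getAdapter('brave')
const third = getAdapter('bing')
console.log(first === second, third === first) // true false
```

A single slot is enough because only one backend is active at a time. The cost of this design is that switching back and forth between backends rebuilds the adapter each time, which is acceptable if adapters are stateless as the documentation says.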

Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
+import { afterEach, describe, expect, mock, test } from 'bun:test'
+
+let isFirstPartyBaseUrl = true
+
+mock.module('../adapters/apiAdapter.js', () => ({
+  ApiSearchAdapter: class ApiSearchAdapter {},
+}))
+
+mock.module('../adapters/bingAdapter.js', () => ({
+  BingSearchAdapter: class BingSearchAdapter {},
+}))
+
+mock.module('../adapters/braveAdapter.js', () => ({
+  BraveSearchAdapter: class BraveSearchAdapter {},
+}))
+
+mock.module('../../../utils/model/providers.js', () => ({
+  isFirstPartyAnthropicBaseUrl: () => isFirstPartyBaseUrl,
+}))
+
+const { createAdapter } = await import('../adapters/index')
+
+const originalWebSearchAdapter = process.env.WEB_SEARCH_ADAPTER
+
+afterEach(() => {
+  isFirstPartyBaseUrl = true
+
+  if (originalWebSearchAdapter === undefined) {
+    delete process.env.WEB_SEARCH_ADAPTER
+  } else {
+    process.env.WEB_SEARCH_ADAPTER = originalWebSearchAdapter
+  }
+})
+
+describe('createAdapter', () => {
+  test('reuses the same instance when the selected backend does not change', () => {
+    process.env.WEB_SEARCH_ADAPTER = 'brave'
+
+    const firstAdapter = createAdapter()
+    const secondAdapter = createAdapter()
+
+    expect(firstAdapter).toBe(secondAdapter)
+    expect(firstAdapter.constructor.name).toBe('BraveSearchAdapter')
+  })
+
+  test('rebuilds the adapter when WEB_SEARCH_ADAPTER changes', () => {
+    process.env.WEB_SEARCH_ADAPTER = 'brave'
+    const braveAdapter = createAdapter()
+
+    process.env.WEB_SEARCH_ADAPTER = 'bing'
+    const bingAdapter = createAdapter()
+
+    expect(bingAdapter).not.toBe(braveAdapter)
+    expect(bingAdapter.constructor.name).toBe('BingSearchAdapter')
+  })
+
+  test('selects the API adapter for first-party Anthropic URLs', () => {
+    delete process.env.WEB_SEARCH_ADAPTER
+    isFirstPartyBaseUrl = true
+
+    expect(createAdapter().constructor.name).toBe('ApiSearchAdapter')
+  })
+
+  test('selects the Bing adapter for third-party Anthropic base URLs', () => {
+    delete process.env.WEB_SEARCH_ADAPTER
+    isFirstPartyBaseUrl = false
+
+    expect(createAdapter().constructor.name).toBe('BingSearchAdapter')
+  })
+})
Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,106 @@
+import { describe, expect, test } from 'bun:test'
+import { extractBraveResults } from '../adapters/braveAdapter'
+
+describe('extractBraveResults', () => {
+  test('extracts generic grounding results', () => {
+    const results = extractBraveResults({
+      grounding: {
+        generic: [
+          {
+            title: 'Example Title 1',
+            url: 'https://example.com/page1',
+            snippets: ['First result description'],
+          },
+          {
+            title: 'Example Title 2',
+            url: 'https://example.com/page2',
+            snippets: ['Second result description'],
+          },
+        ],
+      },
+    })
+
+    expect(results).toEqual([
+      {
+        title: 'Example Title 1',
+        url: 'https://example.com/page1',
+        snippet: 'First result description',
+      },
+      {
+        title: 'Example Title 2',
+        url: 'https://example.com/page2',
+        snippet: 'Second result description',
+      },
+    ])
+  })
+
+  test('combines generic, poi, and map grounding results', () => {
+    const results = extractBraveResults({
+      grounding: {
+        generic: [{ title: 'Generic', url: 'https://example.com/generic' }],
+        poi: { title: 'POI', url: 'https://maps.example.com/poi' },
+        map: [{ title: 'Map', url: 'https://maps.example.com/map' }],
+      },
+    })
+
+    expect(results).toEqual([
+      { title: 'Generic', url: 'https://example.com/generic', snippet: undefined },
+      { title: 'POI', url: 'https://maps.example.com/poi', snippet: undefined },
+      { title: 'Map', url: 'https://maps.example.com/map', snippet: undefined },
+    ])
+  })
+
+  test('joins multiple snippets into one summary string', () => {
+    const results = extractBraveResults({
+      grounding: {
+        generic: [
+          {
+            title: 'Joined Snippets',
+            url: 'https://example.com/joined',
+            snippets: ['First snippet.', 'Second snippet.'],
+          },
+        ],
+      },
+    })
+
+    expect(results[0].snippet).toBe('First snippet. Second snippet.')
+  })
+
+  test('skips entries without a title or URL', () => {
+    const results = extractBraveResults({
+      grounding: {
+        generic: [
+          { title: 'Missing URL' },
+          { url: 'https://example.com/missing-title' },
+          { title: 'Valid', url: 'https://example.com/valid' },
+        ],
+      },
+    })
+
+    expect(results).toEqual([
+      { title: 'Valid', url: 'https://example.com/valid', snippet: undefined },
+    ])
+  })
+
+  test('deduplicates repeated URLs across grounding buckets', () => {
+    const results = extractBraveResults({
+      grounding: {
+        generic: [{ title: 'First', url: 'https://example.com/dup' }],
+        poi: { title: 'Second', url: 'https://example.com/dup' },
+        map: [{ title: 'Third', url: 'https://example.com/dup' }],
+      },
+    })
+
+    expect(results).toEqual([
+      { title: 'First', url: 'https://example.com/dup', snippet: undefined },
+    ])
+  })
+
+  test('returns empty array when grounding is missing', () => {
+    expect(extractBraveResults({})).toEqual([])
+  })
+
+  test('returns empty array when grounding arrays are absent', () => {
+    expect(extractBraveResults({ grounding: {} })).toEqual([])
+  })
+})
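
The tests above pin down the observable behaviour of `extractBraveResults` without showing its body. One way a function could satisfy them is sketched below. This is a reconstruction from the test expectations only: the name `extractBraveResultsSketch`, the interface names, and the processing order (generic, then poi, then map) are assumptions inferred from the assertions, and the real `braveAdapter.ts` may differ.

```typescript
// Shapes inferred from the test payloads; the real types are not in this diff.
interface GroundingEntry {
  title?: string
  url?: string
  snippets?: string[]
}

interface BravePayload {
  grounding?: {
    generic?: GroundingEntry[]
    poi?: GroundingEntry
    map?: GroundingEntry[]
  }
}

interface SearchResult {
  title: string
  url: string
  snippet: string | undefined
}

function extractBraveResultsSketch(payload: BravePayload): SearchResult[] {
  const grounding = payload.grounding
  if (!grounding) return []

  // Flatten the three buckets in the order the tests expect.
  const entries: GroundingEntry[] = [
    ...(grounding.generic ?? []),
    ...(grounding.poi ? [grounding.poi] : []),
    ...(grounding.map ?? []),
  ]

  const seenUrls = new Set<string>()
  const results: SearchResult[] = []
  for (const entry of entries) {
    // Skip incomplete entries and URLs that were already emitted.
    if (!entry.title || !entry.url || seenUrls.has(entry.url)) continue
    seenUrls.add(entry.url)
    // Multiple snippets are joined with a single space; none yields undefined.
    const snippet =
      entry.snippets && entry.snippets.length > 0 ? entry.snippets.join(' ') : undefined
    results.push({ title: entry.title, url: entry.url, snippet })
  }
  return results
}

console.log(
  extractBraveResultsSketch({
    grounding: {
      generic: [{ title: 'First', url: 'https://example.com/dup', snippets: ['a.', 'b.'] }],
      poi: { title: 'Second', url: 'https://example.com/dup' },
    },
  }),
)
```

Deduplicating on the first occurrence means an earlier bucket wins over a later one, which is what the "deduplicates repeated URLs across grounding buckets" test expects when it keeps the entry titled 'First'.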
