15 changes: 15 additions & 0 deletions .env.example
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
# Glean MCP Server Configuration
# Copy this file to .env and fill in your values

# Your Glean server URL (copy from your Glean admin panel)
GLEAN_SERVER_URL=https://your-company-be.glean.com/

# Your Glean API token (generate from Glean settings)
GLEAN_API_TOKEN=your_api_token_here

# Optional: User to impersonate (only valid with global tokens)
# [email protected]

# Alternative configuration (legacy):
# GLEAN_INSTANCE=your-company # Note: -be is automatically appended
# GLEAN_BASE_URL=https://your-company-be.glean.com/
1 change: 1 addition & 0 deletions .gitignore
@@ -3,3 +3,4 @@ build
sandbox
sand\ box
debug.log
.env
126 changes: 126 additions & 0 deletions docs/pagination.md
@@ -0,0 +1,126 @@
# Pagination Support in Glean MCP Server

The Glean MCP Server now supports pagination for search results and chat responses, helping to manage large result sets and prevent token limit errors.

## Search Pagination

Both `company_search` and `people_profile_search` tools support pagination through the `cursor` parameter.

### Basic Usage

```json
// First request
{
  "query": "Docker projects",
  "pageSize": 20
}

// Response includes pagination info
{
  "results": [...],
  "cursor": "abc123",
  "hasMoreResults": true,
  "totalResults": 150
}

// Next page request
{
  "query": "Docker projects",
  "pageSize": 20,
  "cursor": "abc123"
}
```

### People Search Example

```json
// Initial search
{
  "query": "DevOps engineers",
  "filters": {
    "department": "Engineering"
  },
  "pageSize": 25
}

// Continue with cursor from response
{
  "query": "DevOps engineers",
  "filters": {
    "department": "Engineering"
  },
  "pageSize": 25,
  "cursor": "next-page-cursor"
}
```

## Chat Response Chunking

The chat tool automatically splits any response that would exceed the token limit (~25k tokens) into smaller chunks.

### Automatic Chunking

When a chat response is too large, it's automatically split into manageable chunks:

```json
// Initial chat request
{
  "message": "Explain all our microservices architecture"
}

// Response with chunk metadata
{
  "content": "... first part of response ...",
  "_chunkMetadata": {
    "responseId": "uuid-123",
    "chunkIndex": 0,
    "totalChunks": 3,
    "hasMore": true
  }
}
```

### Continuing Chunked Responses

To get subsequent chunks:

```json
{
  "message": "",
  "continueFrom": {
    "responseId": "uuid-123",
    "chunkIndex": 1
  }
}
```
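
As a rough client-side sketch of the continuation flow above (not the server's own code), the loop below keeps requesting the next `chunkIndex` while `_chunkMetadata.hasMore` is true. The `CallChat` type and `readFullChatResponse` name are hypothetical stand-ins for however your MCP client invokes the chat tool; only the request and response field names come from the examples above.

```typescript
// Field names mirror the JSON examples above.
interface ChunkMetadata {
  responseId: string;
  chunkIndex: number;
  totalChunks: number;
  hasMore: boolean;
}

interface ChatChunk {
  content: string;
  _chunkMetadata?: ChunkMetadata;
}

interface ChatRequest {
  message: string;
  continueFrom?: { responseId: string; chunkIndex: number };
}

// Hypothetical transport: whatever function your client uses to call the chat tool.
type CallChat = (request: ChatRequest) => Promise<ChatChunk>;

// Returns the chunks in order; how to join them is left to the caller.
async function readFullChatResponse(
  callChat: CallChat,
  message: string,
): Promise<string[]> {
  let chunk = await callChat({ message });
  const parts: string[] = [chunk.content];

  // A response without chunk metadata is a normal, unchunked reply.
  while (chunk._chunkMetadata && chunk._chunkMetadata.hasMore) {
    const { responseId, chunkIndex } = chunk._chunkMetadata;
    chunk = await callChat({
      message: '',
      continueFrom: { responseId, chunkIndex: chunkIndex + 1 },
    });
    parts.push(chunk.content);
  }
  return parts;
}
```

Returning an array rather than a joined string avoids guessing which separator the server removed when it split the text.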

## Implementation Details

### Token Limits
- Maximum tokens per response: 20,000 (safe limit below 25k)
- Character to token ratio: ~4 characters per token

### Chunking Strategy
1. Attempts to split at paragraph boundaries (double newlines)
2. Falls back to sentence boundaries if paragraphs are too large
3. Force splits at character level for extremely long unbroken text
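
The three-step strategy can be sketched roughly as follows. This is an illustrative approximation under the limits stated above (20,000 tokens, ~4 characters per token), not the server's actual implementation; the function names and the sentence-boundary regex are assumptions.

```typescript
const CHARS_PER_TOKEN = 4; // approximate ratio stated above
const MAX_TOKENS = 20000; // per-response limit stated above

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Steps 2 and 3: split an oversized paragraph at sentence boundaries,
// then slice any sentence that is still too long at the character level.
function splitOversized(piece: string, maxChars: number): string[] {
  const sentences: string[] =
    piece.match(/[^.!?]*[.!?]+\s*|[^.!?]+$/g) ?? [piece];
  const out: string[] = [];
  for (const sentence of sentences) {
    for (let i = 0; i < sentence.length; i += maxChars) {
      out.push(sentence.slice(i, i + maxChars));
    }
  }
  return out;
}

function chunkText(text: string, maxTokens: number = MAX_TOKENS): string[] {
  const maxChars = maxTokens * CHARS_PER_TOKEN;
  // Step 1: paragraphs, each keeping its trailing blank line(s).
  const paragraphs = (text.match(/[\s\S]*?(?:\n\n+|$)/g) ?? []).filter(
    (p) => p.length > 0,
  );

  const chunks: string[] = [];
  let current = '';
  for (const paragraph of paragraphs) {
    const pieces =
      paragraph.length <= maxChars
        ? [paragraph]
        : splitOversized(paragraph, maxChars);
    for (const piece of pieces) {
      // Start a new chunk when adding this piece would exceed the limit.
      if (current.length > 0 && current.length + piece.length > maxChars) {
        chunks.push(current);
        current = '';
      }
      current += piece;
    }
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

Because every piece keeps its original whitespace, concatenating the chunks reproduces the input text exactly.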

### Response Format
All paginated responses include:
- `cursor` or `_chunkMetadata`: Pagination state
- `hasMoreResults` or `hasMore`: Boolean indicating more data available
- `totalResults` or `totalChunks`: Total count when available

## Best Practices

1. **Set appropriate page sizes**: Balance between response size and number of requests
2. **Handle pagination in loops**: When fetching all results, continue until `hasMoreResults` is false
3. **Store cursors**: Keep track of cursors for user sessions to allow navigation
4. **Error handling**: Always check for continuation metadata before attempting to continue
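
A minimal sketch of practices 2 and 4, assuming a hypothetical `search` function that wraps a call to `company_search` or `people_profile_search` and resolves to the response shape shown earlier. The `maxPages` guard is an added assumption to keep a misbehaving cursor from looping forever.

```typescript
interface SearchRequest {
  query: string;
  pageSize: number;
  cursor?: string;
}

interface SearchPage<T> {
  results: T[];
  cursor?: string;
  hasMoreResults: boolean;
}

// Hypothetical wrapper around the search tool call.
type SearchFn<T> = (request: SearchRequest) => Promise<SearchPage<T>>;

async function fetchAllResults<T>(
  search: SearchFn<T>,
  query: string,
  pageSize: number = 20,
  maxPages: number = 50,
): Promise<T[]> {
  const all: T[] = [];
  let cursor: string | undefined;

  for (let page = 0; page < maxPages; page++) {
    // Only send a cursor once a previous response has provided one.
    const request: SearchRequest = cursor
      ? { query, pageSize, cursor }
      : { query, pageSize };
    const response = await search(request);
    all.push(...response.results);

    // Stop when the server reports no more data or gives nothing to continue from.
    if (!response.hasMoreResults || !response.cursor) break;
    cursor = response.cursor;
  }
  return all;
}
```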

## Error Handling

Common errors:
- Invalid cursor: returns an error if the cursor is expired or invalid
- Invalid chunk index: returns null if the requested chunk doesn't exist
- Missing continuation data: returns a normal chat response if no previous chunks exist
4 changes: 2 additions & 2 deletions packages/configure-mcp-server/src/index.ts
@@ -33,7 +33,7 @@ import {
forceRefreshTokens,
setupMcpRemote,
} from '@gleanwork/mcp-server-utils/auth';
import { chat, formatResponse } from '@gleanwork/local-mcp-server/tools/chat';
import { chat, formatChunkedResponse } from '@gleanwork/local-mcp-server/tools/chat';
import { VERSION } from './common/version.js';
import { checkAndOpenLaunchWarning } from '@gleanwork/mcp-server-utils/util';

@@ -277,7 +277,7 @@ connect after configuration.
case 'auth-test': {
try {
const chatResponse = await chat({ message: 'Who am I?' });
trace('auth-test search', formatResponse(chatResponse));
trace('auth-test search', formatChunkedResponse(chatResponse));
console.log('Access token accepted.');
} catch (err: any) {
error('auth-test error', err);
68 changes: 60 additions & 8 deletions packages/local-mcp-server/README.md
@@ -9,30 +9,36 @@ The Glean MCP Server is a [Model Context Protocol (MCP)](https://modelcontextpro

## Features

- **Company Search**: Access Glean's powerful content search capabilities
- **People Profile Search**: Access Glean's people directory
- **Chat**: Interact with Glean's AI assistant
- **Company Search**: Access Glean's powerful content search capabilities with pagination support
- **People Profile Search**: Access Glean's people directory with pagination support
- **Chat**: Interact with Glean's AI assistant with automatic response chunking for large responses
- **Read Documents**: Retrieve documents from Glean by ID or URL
- **Pagination Support**: Handle large result sets efficiently with cursor-based pagination
- **Response Chunking**: Automatically splits large chat responses to avoid token limits
- **MCP Compliant**: Implements the Model Context Protocol specification

## Tools

- ### company_search

Search Glean's content index using the Glean Search API. This tool allows you to query Glean's content index with various filtering and configuration options.
Search Glean's content index using the Glean Search API. This tool allows you to query Glean's content index with various filtering and configuration options. Supports pagination through the `cursor` parameter for handling large result sets.

- ### chat

Interact with Glean's AI assistant using the Glean Chat API. This tool allows you to have conversational interactions with Glean's AI, including support for message history, citations, and various configuration options.
Interact with Glean's AI assistant using the Glean Chat API. This tool allows you to have conversational interactions with Glean's AI, including support for message history, citations, and various configuration options. Automatically chunks large responses to avoid token limits and provides continuation support.

- ### people_profile_search

Search Glean's People directory to find employee information.
Search Glean's People directory to find employee information. Supports pagination through the `cursor` parameter for handling large result sets.

- ### read_documents

Read documents from Glean by providing document IDs or URLs. This tool allows you to retrieve the full content of specific documents for detailed analysis or reference.

## Pagination

For detailed information about pagination support and examples, see [Pagination Documentation](../../docs/pagination.md).

## MCP Client Configuration

To configure this MCP server in your MCP client (such as Claude Desktop, Windsurf, Cursor, etc.), run [@gleanwork/configure-mcp-server](https://github.com/gleanwork/mcp-server/tree/main/packages/configure-mcp-server) passing in your client, token and instance.
@@ -58,15 +64,61 @@ To manually configure an MCP client (such as Claude Desktop, Windsurf, Cursor, e
"command": "npx",
"args": ["-y", "@gleanwork/local-mcp-server"],
"env": {
"GLEAN_INSTANCE": "<glean instance name>",
"GLEAN_SERVER_URL": "<your server URL from Glean admin panel>",
"GLEAN_API_TOKEN": "<glean api token>"
}
}
}
}
```

Replace the environment variable values with your actual Glean credentials.
Example values:
- `GLEAN_SERVER_URL`: `https://acme-corp-be.glean.com/` (copy from your Glean admin panel)
- `GLEAN_API_TOKEN`: Your API token from Glean settings

Alternative configuration (legacy - note that `-be` is automatically appended):
```json
"env": {
"GLEAN_INSTANCE": "acme-corp", // becomes https://acme-corp-be.glean.com/
"GLEAN_API_TOKEN": "<glean api token>"
}
```
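
To illustrate how the two configuration styles relate, here is a small sketch of resolving the server URL from either variable. The `resolveServerUrl` helper and the precedence order (`GLEAN_SERVER_URL` first) are assumptions for illustration; only the `-be` suffix rule comes from the note above.

```typescript
interface GleanEnv {
  GLEAN_SERVER_URL?: string;
  GLEAN_INSTANCE?: string;
}

// Hypothetical helper: prefers the explicit URL, otherwise derives it from the
// legacy instance name by appending "-be" as described above.
function resolveServerUrl(env: GleanEnv): string {
  if (env.GLEAN_SERVER_URL) {
    return env.GLEAN_SERVER_URL;
  }
  if (env.GLEAN_INSTANCE) {
    return `https://${env.GLEAN_INSTANCE}-be.glean.com/`;
  }
  throw new Error('Set GLEAN_SERVER_URL or GLEAN_INSTANCE');
}
```

For example, `resolveServerUrl({ GLEAN_INSTANCE: 'acme-corp' })` yields `https://acme-corp-be.glean.com/`.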

### Local Development

For local development, you can use a `.env` file to store your credentials:

1. Copy the example environment file:
```bash
cp ../../.env.example ../../.env
```

2. Edit `.env` with your values:
```bash
# .env
GLEAN_SERVER_URL=https://your-company-be.glean.com/
GLEAN_API_TOKEN=your_api_token_here
```

3. Run the server locally:
```bash
npm run build
node build/index.js
```

4. For use with MCP clients during development:
```json
{
"mcpServers": {
"glean-dev": {
"command": "node",
"args": ["/path/to/packages/local-mcp-server/build/index.js"]
}
}
}
```

The server will automatically load environment variables from the `.env` file.

### Debugging

1 change: 1 addition & 0 deletions packages/local-mcp-server/src/index.ts
@@ -8,6 +8,7 @@
* @module @gleanwork/local-mcp-server
*/

import 'dotenv/config';
import meow from 'meow';
import { runServer } from './server.js';
import { Logger, trace, LogLevel } from '@gleanwork/mcp-server-utils/logger';
35 changes: 30 additions & 5 deletions packages/local-mcp-server/src/server.ts
@@ -60,46 +60,71 @@ export async function listToolsHandler() {
name: TOOL_NAMES.companySearch,
description: `Find relevant company documents and data

Example request:
Example requests:

// Basic search
{
"query": "What are the company holidays this year?",
"datasources": ["drive", "confluence"]
}

// Search with pagination
{
"query": "Docker projects",
"pageSize": 20,
"cursor": "pagination_cursor" // From previous response
}
`,
inputSchema: zodToJsonSchema(search.ToolSearchSchema),
},
{
name: TOOL_NAMES.chat,
description: `Chat with Glean Assistant using Glean's RAG

Example request:
Example requests:

// Basic chat
{
"message": "What are the company holidays this year?",
"context": [
"Hello, I need some information about time off.",
"I'm planning my vacation for next year."
]
}

// Continue from chunked response
{
"message": "",
"continueFrom": {
"responseId": "uuid-here",
"chunkIndex": 1
}
}
`,
inputSchema: zodToJsonSchema(chat.ToolChatSchema),
},
{
name: TOOL_NAMES.peopleProfileSearch,
description: `Search for people profiles in the company

Example request:
Example requests:

// Basic search
{
"query": "Find people named John Doe",
"filters": {
"department": "Engineering",
"department": "Engineering",
"city": "San Francisco"
},
"pageSize": 10
}

// Search with pagination
{
"query": "DevOps engineers",
"pageSize": 25,
"cursor": "pagination_cursor" // From previous response
}
`,
inputSchema: zodToJsonSchema(
peopleProfileSearch.ToolPeopleProfileSearchSchema,
@@ -152,7 +177,7 @@ export async function callToolHandler(
case TOOL_NAMES.chat: {
const args = chat.ToolChatSchema.parse(request.params.arguments);
const result = await chat.chat(args);
const formattedResults = chat.formatResponse(result);
const formattedResults = chat.formatChunkedResponse(result);

return {
content: [{ type: 'text', text: formattedResults }],