Commit 5073c7d

[experimental] feat(ee): GitHub permission syncing (#508)
1 parent a76ae68 commit 5073c7d

57 files changed

+2144
-1226
lines changed

CHANGELOG.md

Lines changed: 3 additions & 0 deletions
@@ -7,6 +7,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+### Added
+- [Experimental][Sourcebot EE] Added permission syncing of repository Access Control Lists (ACLs) between Sourcebot and GitHub. [#508](https://github.com/sourcebot-dev/sourcebot/pull/508)
+
 ### Changed
 - Improved repository query performance by adding db indices. [#526](https://github.com/sourcebot-dev/sourcebot/pull/526)
 - Improved repository query performance by removing JOIN on `Connection` table. [#527](https://github.com/sourcebot-dev/sourcebot/pull/527)

LICENSE.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@ Copyright (c) 2025 Taqla Inc.
 
 Portions of this software are licensed as follows:
 
-- All content that resides under the "ee/", "packages/web/src/ee/", and "packages/shared/src/ee/" directories of this repository, if these directories exist, is licensed under the license defined in "ee/LICENSE".
+- All content that resides under the "ee/", "packages/web/src/ee/", "packages/backend/src/ee/", and "packages/shared/src/ee/" directories of this repository, if these directories exist, is licensed under the license defined in "ee/LICENSE".
 - All third party components incorporated into the Sourcebot Software are licensed under the original license provided by the owner of the applicable component.
 - Content outside of the above mentioned directories or restrictions above is available under the "Functional Source License" as defined below.

docs/docs.json

Lines changed: 1 addition & 0 deletions
@@ -46,6 +46,7 @@
 "docs/features/code-navigation",
 "docs/features/analytics",
 "docs/features/mcp-server",
+"docs/features/permission-syncing",
 {
 "group": "Agents",
 "tag": "experimental",

docs/docs/configuration/config-file.mdx

Lines changed: 16 additions & 14 deletions
@@ -33,17 +33,19 @@ Sourcebot syncs the config file on startup, and automatically whenever a change
 
 The following are settings that can be provided in your config file to modify Sourcebot's behavior
 
-| Setting | Type | Default | Minimum | Description / Notes |
-|-------------------------------------------|---------|------------|---------|----------------------------------------------------------------------------------------|
-| `maxFileSize` | number | 2 MB | 1 | Maximum size (bytes) of a file to index. Files exceeding this are skipped. |
-| `maxTrigramCount` | number | 20 000 | 1 | Maximum trigrams per document. Larger files are skipped. |
-| `reindexIntervalMs` | number | 1 hour | 1 | Interval at which all repositories are re‑indexed. |
-| `resyncConnectionIntervalMs` | number | 24 hours | 1 | Interval for checking connections that need re‑syncing. |
-| `resyncConnectionPollingIntervalMs` | number | 1 second | 1 | DB polling rate for connections that need re‑syncing. |
-| `reindexRepoPollingIntervalMs` | number | 1 second | 1 | DB polling rate for repos that should be re‑indexed. |
-| `maxConnectionSyncJobConcurrency` | number | 8 | 1 | Concurrent connection‑sync jobs. |
-| `maxRepoIndexingJobConcurrency` | number | 8 | 1 | Concurrent repo‑indexing jobs. |
-| `maxRepoGarbageCollectionJobConcurrency` | number | 8 | 1 | Concurrent repo‑garbage‑collection jobs. |
-| `repoGarbageCollectionGracePeriodMs` | number | 10 seconds | 1 | Grace period to avoid deleting shards while loading. |
-| `repoIndexTimeoutMs` | number | 2 hours | 1 | Timeout for a single repo‑indexing run. |
-| `enablePublicAccess` **(deprecated)** | boolean | false || Use the `FORCE_ENABLE_ANONYMOUS_ACCESS` environment variable instead. |
+| Setting | Type | Default | Minimum | Description / Notes |
+|-------------------------------------------------|---------|------------|---------|----------------------------------------------------------------------------------------|
+| `maxFileSize` | number | 2 MB | 1 | Maximum size (bytes) of a file to index. Files exceeding this are skipped. |
+| `maxTrigramCount` | number | 20 000 | 1 | Maximum trigrams per document. Larger files are skipped. |
+| `reindexIntervalMs` | number | 1 hour | 1 | Interval at which all repositories are re‑indexed. |
+| `resyncConnectionIntervalMs` | number | 24 hours | 1 | Interval for checking connections that need re‑syncing. |
+| `resyncConnectionPollingIntervalMs` | number | 1 second | 1 | DB polling rate for connections that need re‑syncing. |
+| `reindexRepoPollingIntervalMs` | number | 1 second | 1 | DB polling rate for repos that should be re‑indexed. |
+| `maxConnectionSyncJobConcurrency` | number | 8 | 1 | Concurrent connection‑sync jobs. |
+| `maxRepoIndexingJobConcurrency` | number | 8 | 1 | Concurrent repo‑indexing jobs. |
+| `maxRepoGarbageCollectionJobConcurrency` | number | 8 | 1 | Concurrent repo‑garbage‑collection jobs. |
+| `repoGarbageCollectionGracePeriodMs` | number | 10 seconds | 1 | Grace period to avoid deleting shards while loading. |
+| `repoIndexTimeoutMs` | number | 2 hours | 1 | Timeout for a single repo‑indexing run. |
+| `enablePublicAccess` **(deprecated)** | boolean | false || Use the `FORCE_ENABLE_ANONYMOUS_ACCESS` environment variable instead. |
+| `experiment_repoDrivenPermissionSyncIntervalMs` | number | 24 hours | 1 | Interval at which the repo permission syncer should run. |
+| `experiment_userDrivenPermissionSyncIntervalMs` | number | 24 hours | 1 | Interval at which the user permission syncer should run. |
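
Note on the two new rows: the table lists the defaults as "24 hours", but the `Ms` suffix (and the schema description elsewhere in this commit) indicates the values are written in milliseconds. The following is a minimal sketch of producing such values, not code from the commit. The `hoursToMs` helper, the chosen intervals, and the top-level `settings` key are assumptions made for illustration.

```typescript
// Hypothetical helper: converts a number of hours to milliseconds.
const hoursToMs = (hours: number): number => hours * 60 * 60 * 1000;

// Illustrative values only. Per the table above, both settings default to 24 hours.
const settings = {
    experiment_repoDrivenPermissionSyncIntervalMs: hoursToMs(6),
    experiment_userDrivenPermissionSyncIntervalMs: hoursToMs(1),
};

// Assumption: the settings are nested under a top-level "settings" key in the config file.
const configFragment = JSON.stringify({ settings }, null, 2);
console.log(configFragment);
```

Deriving the numbers from a named helper keeps the intent readable in review, since a raw literal such as `21600000` is easy to mistype by a factor of ten.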

docs/docs/configuration/environment-variables.mdx

Lines changed: 1 addition & 0 deletions
@@ -59,6 +59,7 @@ The following environment variables allow you to configure your Sourcebot deploy
 | `AUTH_EE_OKTA_ISSUER` | `-` | <p>The issuer URL for Okta SSO authentication.</p> |
 | `AUTH_EE_GCP_IAP_ENABLED` | `false` | <p>When enabled, allows Sourcebot to automatically register/login from a successful GCP IAP redirect</p> |
 | `AUTH_EE_GCP_IAP_AUDIENCE` | - | <p>The GCP IAP audience to use when verifying JWT tokens. Must be set to enable GCP IAP JIT provisioning</p> |
+| `EXPERIMENT_EE_PERMISSION_SYNC_ENABLED` | `false` | <p>Enables [permission syncing](/docs/features/permission-syncing).</p> |
 
 
 ### Review Agent Environment Variables

docs/docs/connections/github.mdx

Lines changed: 5 additions & 1 deletion
@@ -196,4 +196,8 @@ To connect to a GitHub host other than `github.com`, provide the `url` property
 
 <GitHubSchema />
 
-</Accordion>
+</Accordion>
+
+## See also
+
+- [Syncing GitHub access permissions to Sourcebot](/docs/features/permission-syncing#github)

docs/docs/features/agents/overview.mdx

Lines changed: 3 additions & 3 deletions
@@ -3,9 +3,9 @@ title: "Agents Overview"
 sidebarTitle: "Overview"
 ---
 
-<Warning>
-Agents are currently a experimental feature. Have an idea for an agent that we haven't built? Submit a [feature request](https://github.com/sourcebot-dev/sourcebot/issues/new?template=feature_request.md) on our GitHub.
-</Warning>
+import ExperimentalFeatureWarning from '/snippets/experimental-feature-warning.mdx'
+
+<ExperimentalFeatureWarning />
 
 Agents are automations that leverage the code indexed on Sourcebot to perform a specific task. Once you've setup Sourcebot, check out the
 guides below to configure additional agents.
Lines changed: 72 additions & 0 deletions
@@ -0,0 +1,72 @@
+---
+title: "Permission syncing"
+sidebarTitle: "Permission syncing"
+tag: "experimental"
+---
+
+import LicenseKeyRequired from '/snippets/license-key-required.mdx'
+import ExperimentalFeatureWarning from '/snippets/experimental-feature-warning.mdx'
+
+<LicenseKeyRequired />
+<ExperimentalFeatureWarning />
+
+# Overview
+
+Permission syncing allows you to sync Access Control Lists (ACLs) from a code host to Sourcebot. When configured, users signed into Sourcebot (via the code host's OAuth provider) will only be able to access repositories that they have access to on the code host. Practically, this means:
+
+- Code Search results will only include repositories that the user has access to.
+- Code navigation results will only include repositories that the user has access to.
+- Ask Sourcebot (and the underlying LLM) will only have access to repositories that the user has access to.
+- File browsing is scoped to the repositories that the user has access to.
+
+Permission syncing can be enabled by setting the `EXPERIMENT_EE_PERMISSION_SYNC_ENABLED` environment variable to `true`.
+
+```bash
+docker run \
+  -e EXPERIMENT_EE_PERMISSION_SYNC_ENABLED=true \
+  /* additional args */ \
+  ghcr.io/sourcebot-dev/sourcebot:latest
+```
+
+## Platform support
+
+We are actively working on supporting more code hosts. If you'd like to see a specific code host supported, please [reach out](https://www.sourcebot.dev/contact).
+
+| Platform | Permission syncing |
+|:----------|------------------------------|
+| [GitHub (GHEC & GHES)](/docs/features/permission-syncing#github) | ✅ |
+| GitLab | 🛑 |
+| Bitbucket Cloud | 🛑 |
+| Bitbucket Data Center | 🛑 |
+| Gitea | 🛑 |
+| Gerrit | 🛑 |
+| Generic git host | 🛑 |
+
+# Getting started
+
+## GitHub
+
+Prerequisite: [Add GitHub as an OAuth provider](/docs/configuration/auth/providers#github).
+
+Permission syncing works with **GitHub.com**, **GitHub Enterprise Cloud**, and **GitHub Enterprise Server**. For organization-owned repositories, users that have **read-only** access (or above) via the following methods will have their access synced to Sourcebot:
+- Outside collaborators
+- Organization members that are direct collaborators
+- Organization members with access through team memberships
+- Organization members with access through default organization permissions
+- Organization owners
+
+**Notes:**
+- A GitHub OAuth provider must be configured to (1) correlate a Sourcebot user with a GitHub user, and (2) list the repositories that the user has access to for [User driven syncing](/docs/features/permission-syncing#how-it-works).
+- OAuth tokens must have the `repo` scope in order to use the [List repositories for the authenticated user API](https://docs.github.com/en/rest/repos/repos?apiVersion=2022-11-28#list-repositories-for-the-authenticated-user) during [User driven syncing](/docs/features/permission-syncing#how-it-works). Sourcebot **will only** use this token for **reads**.
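
Note on the second bullet: the linked endpoint is `GET /user/repos`, which is paginated. The sketch below shows one way a read-only client could page through it with a user's OAuth token. It is an illustration under stated assumptions, not Sourcebot's implementation: `listAccessibleRepos`, `FetchLike`, and `GitHubRepo` are hypothetical names, and only two fields of the response are modeled.

```typescript
// Minimal shape of the fields this sketch reads from each repository object.
interface GitHubRepo {
    id: number;
    full_name: string;
}

// Subset of the fetch API that the sketch depends on, so a fake can be injected in tests.
type FetchLike = (
    url: string,
    init: { headers: Record<string, string> },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Hypothetical helper: pages through GET /user/repos until a short page is returned.
async function listAccessibleRepos(
    token: string,
    fetchImpl: FetchLike,
    apiBaseUrl = "https://api.github.com",
): Promise<GitHubRepo[]> {
    const perPage = 100; // largest page size the endpoint accepts
    const repos: GitHubRepo[] = [];

    for (let page = 1; ; page++) {
        const response = await fetchImpl(
            `${apiBaseUrl}/user/repos?per_page=${perPage}&page=${page}`,
            {
                headers: {
                    Accept: "application/vnd.github+json",
                    Authorization: `Bearer ${token}`,
                    "X-GitHub-Api-Version": "2022-11-28",
                },
            },
        );
        if (!response.ok) {
            throw new Error(`GitHub API request failed with status ${response.status}`);
        }

        const batch = (await response.json()) as GitHubRepo[];
        repos.push(...batch);
        // A page shorter than perPage means there is nothing left to fetch.
        if (batch.length < perPage) {
            return repos;
        }
    }
}
```

Injecting `fetchImpl` keeps the sketch testable without network access. On a runtime with a global `fetch`, that function can likely be passed directly, and for GitHub Enterprise Server the `apiBaseUrl` argument would point at the instance's own API root instead of `api.github.com`.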
+
+# How it works
+
+Permission syncing works by periodically syncing ACLs from the code host(s) to Sourcebot to build an internal mapping between Users and Repositories. This mapping is hydrated in two directions:
+- **User driven**: fetches the list of all repositories that a given user has access to.
+- **Repo driven**: fetches the list of all users that have access to a given repository.
+
+User driven and repo driven syncing occurs every 24 hours by default. These intervals can be configured using the following settings in the [config file](/docs/configuration/config-file):
+| Setting | Type | Default | Minimum |
+|-------------------------------------------------|---------|------------|---------|
+| `experiment_repoDrivenPermissionSyncIntervalMs` | number | 24 hours | 1 |
+| `experiment_userDrivenPermissionSyncIntervalMs` | number | 24 hours | 1 |
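
Note on the "How it works" section above: the two sync directions both write into the same User to Repository mapping. The sketch below models that with an in-memory structure to make the idea concrete. It is not the commit's implementation: `PermissionMap` and its methods are hypothetical, and it assumes each sync treats the freshly fetched list as authoritative, so entries missing from the list are revoked.

```typescript
type UserId = string;
type RepoId = string;

// In-memory stand-in for the internal mapping between Users and Repositories.
class PermissionMap {
    private readonly reposByUser = new Map<UserId, Set<RepoId>>();

    // User driven: replace the set of repositories a single user can access.
    syncUser(userId: UserId, accessibleRepoIds: RepoId[]): void {
        this.reposByUser.set(userId, new Set(accessibleRepoIds));
    }

    // Repo driven: replace the set of users that can access a single repository.
    syncRepo(repoId: RepoId, permittedUserIds: UserId[]): void {
        const permitted = new Set(permittedUserIds);
        // Revoke the repository from users that are no longer listed.
        this.reposByUser.forEach((repos, userId) => {
            if (!permitted.has(userId)) {
                repos.delete(repoId);
            }
        });
        // Grant the repository to every listed user.
        permitted.forEach((userId) => {
            const repos = this.reposByUser.get(userId) ?? new Set<RepoId>();
            repos.add(repoId);
            this.reposByUser.set(userId, repos);
        });
    }

    canAccess(userId: UserId, repoId: RepoId): boolean {
        return this.reposByUser.get(userId)?.has(repoId) ?? false;
    }

    // Scope a list of results (for example search hits) to what the user may see.
    filterVisible<T extends { repoId: RepoId }>(userId: UserId, results: T[]): T[] {
        return results.filter((result) => this.canAccess(userId, result.repoId));
    }
}
```

With this model, a user-driven sync for one user and a repo-driven sync for one repository can run independently, and whichever ran most recently for a given pair decides the outcome. `filterVisible` corresponds to the scoping of search, navigation, and file browsing described in the Overview.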
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+
+<Warning>
+This is an experimental feature. Certain functionality may be incomplete and breaking changes may ship in non-major releases. Have feedback? Submit an [issue](https://github.com/sourcebot-dev/sourcebot/issues) on GitHub.
+</Warning>

docs/snippets/schemas/v3/index.schema.mdx

Lines changed: 20 additions & 0 deletions
@@ -69,6 +69,16 @@
 "deprecated": true,
 "description": "This setting is deprecated. Please use the `FORCE_ENABLE_ANONYMOUS_ACCESS` environment variable instead.",
 "default": false
+},
+"experiment_repoDrivenPermissionSyncIntervalMs": {
+"type": "number",
+"description": "The interval (in milliseconds) at which the repo permission syncer should run. Defaults to 24 hours.",
+"minimum": 1
+},
+"experiment_userDrivenPermissionSyncIntervalMs": {
+"type": "number",
+"description": "The interval (in milliseconds) at which the user permission syncer should run. Defaults to 24 hours.",
+"minimum": 1
 }
 },
 "additionalProperties": false
@@ -195,6 +205,16 @@
 "deprecated": true,
 "description": "This setting is deprecated. Please use the `FORCE_ENABLE_ANONYMOUS_ACCESS` environment variable instead.",
 "default": false
+},
+"experiment_repoDrivenPermissionSyncIntervalMs": {
+"type": "number",
+"description": "The interval (in milliseconds) at which the repo permission syncer should run. Defaults to 24 hours.",
+"minimum": 1
+},
+"experiment_userDrivenPermissionSyncIntervalMs": {
+"type": "number",
+"description": "The interval (in milliseconds) at which the user permission syncer should run. Defaults to 24 hours.",
+"minimum": 1
 }
 },
 "additionalProperties": false

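Note on the schema hunks above: the two new properties declare `"type": "number"` and `"minimum": 1` but carry no `default` keyword, so the 24 hour default appears only in the description text. A consumer therefore has to apply it explicitly. The sketch below shows one way to do that. It is an assumption-laden illustration, not code from this commit: `resolveIntervalMs` and `resolvePermissionSyncSettings` are hypothetical names.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

interface PermissionSyncSettings {
    experiment_repoDrivenPermissionSyncIntervalMs?: number;
    experiment_userDrivenPermissionSyncIntervalMs?: number;
}

// Hypothetical helper: applies the documented 24 hour default and enforces
// the schema constraints ("type": "number", "minimum": 1) for one setting.
function resolveIntervalMs(name: string, value: number | undefined): number {
    if (value === undefined) {
        return DAY_MS;
    }
    if (typeof value !== "number" || isNaN(value) || value < 1) {
        throw new Error(`${name} must be a number >= 1, received ${value}`);
    }
    return value;
}

// Returns a settings object in which both intervals are always present.
function resolvePermissionSyncSettings(
    settings: PermissionSyncSettings,
): Required<PermissionSyncSettings> {
    return {
        experiment_repoDrivenPermissionSyncIntervalMs: resolveIntervalMs(
            "experiment_repoDrivenPermissionSyncIntervalMs",
            settings.experiment_repoDrivenPermissionSyncIntervalMs,
        ),
        experiment_userDrivenPermissionSyncIntervalMs: resolveIntervalMs(
            "experiment_userDrivenPermissionSyncIntervalMs",
            settings.experiment_userDrivenPermissionSyncIntervalMs,
        ),
    };
}
```

Note that a minimum of 1 permits a 1 ms interval, which would make the syncers run almost continuously; a deployment may want a higher practical floor than the schema enforces.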