-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Add SemaphoreCI OIDC trusted provider support #19048
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
warehouse/migrations/versions/7a97c540ed60_add_semaphoreci_oidc_models.py
Outdated
Show resolved
Hide resolved
warehouse/oidc/models/semaphore.py
Outdated
| # Extract the repo portion from the sub claim | ||
| if ":repo:" not in signed_claim: | ||
| return False | ||
|
|
||
| parts = signed_claim.split(":repo:") | ||
| if len(parts) < 2: # pragma: no cover | ||
| # This should be unreachable since we already checked ":repo:" is in the string | ||
| return False | ||
|
|
||
| # The repo value is between :repo: and the next : | ||
| repo_and_rest = parts[1] | ||
| repo_parts = repo_and_rest.split(":", 1) | ||
| repo_in_sub = repo_parts[0] | ||
|
|
||
| if not repo_in_sub: | ||
| return False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be simplified a bit:
| # Extract the repo portion from the sub claim | |
| if ":repo:" not in signed_claim: | |
| return False | |
| parts = signed_claim.split(":repo:") | |
| if len(parts) < 2: # pragma: no cover | |
| # This should be unreachable since we already checked ":repo:" is in the string | |
| return False | |
| # The repo value is between :repo: and the next : | |
| repo_and_rest = parts[1] | |
| repo_parts = repo_and_rest.split(":", 1) | |
| repo_in_sub = repo_parts[0] | |
| if not repo_in_sub: | |
| return False | |
| # There should only be one ':repo:' in the sub: | |
| if signed_claim.count(":repo:") != 1: | |
| return False | |
| _, repo_and_rest = signed_claim.split(":repo:", 1) | |
| repo_in_sub, _ = repo_and_rest.split(":", 1) | |
| if not repo_in_sub: | |
| return False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simplified
warehouse/oidc/models/semaphore.py
Outdated
| "org", | ||
| "org_id", | ||
| "prj", | ||
| "prj_id", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not familiar with SemaphoreCI but I'm surprised we wouldn't want to restrict the token based on these claims? (Even if they are duplicated in the subject?) Do you have any more details about them?
Specifically, I would assume that the _id claims could help us prevent resurrection attacks, unless this is not possible with SemaphoreCI?
At the very least, it would be good to check that these match what's configured in addition to checking the subject and repo_slug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added org_id and prj_id to the checked claims to prevent resurrection attacks
warehouse/oidc/models/semaphore.py
Outdated
| repo_slug = signed_claims.get("repo_slug") | ||
| if not repo_slug: | ||
| raise InvalidPublisherError("Missing 'repo_slug' claim") | ||
|
|
||
| # repo_slug format: owner/repository | ||
| if "/" not in repo_slug: | ||
| raise InvalidPublisherError( | ||
| f"Invalid 'repo_slug' claim format: {repo_slug!r}, " | ||
| "expected 'owner/repository'" | ||
| ) | ||
|
|
||
| organization, project = repo_slug.split("/", 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to use repo_slug here rather than the org/prj claims?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This wasn't correct/sufficient. I've updated the logic. Previously, it used git organization + repo name. The new logic uses Semaphore organization name, Semaphore project name, and repo slug (which contains Git organization + repo name)
| from sqlalchemy.orm import Session | ||
|
|
||
|
|
||
| SEMAPHORE_OIDC_ISSUER_URL_SUFFIX = ".semaphoreci.com" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is interesting! Generally our model has been either to support a specific issuer URL or to support custom issuer URLs on a per-organization basis.
Does this mean that there is a different issuer for every SemaphoreCI project? Or generally, what are the expected values here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that there is a different issuer for every SemaphoreCI project? Or generally, what are the expected values here?
Yes, the expected values are of the format https://<org-name>.semaphoreci.com where org-name is the name of the Semaphore organization (not the GitHub organization). Ref https://docs.semaphore.io/reference/openid#reference.
I think the best way to test this would be to put it behind a feature flag and enable it only for test.pypi.org first. |
This attempts to implement #18882.
The automated tests are all passing, but I don't have confidence this actually works because I don't know how to test the actual integration. I have access to and familiarity with Semaphore, and I could test using the 2 SaaS services, if the warehouse were deployed. Though, I am hoping there is an easier way to test this, so I could use some guidance on how to proceed.