Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve control over redirects during auth code flows #248

Merged
merged 5 commits into from
Aug 30, 2023

Conversation

atheriel
Copy link
Collaborator

@atheriel atheriel commented Jul 5, 2023

When sheparding a user through an interactive authorisation code flow to access APIs protected by OAuth2, the oauth_flow_auth_code() function runs a temporary webserver on localhost to capture the code. This is a very common technique for OAuth client libraries.

Unfortunately, this doesn't work in environments where localhost is probably inaccessible to users -- which is normally the case on RStudio Server and Posit Workbench. Previously the oauth_flow_auth_code() function would hang indefinitely in these environments.

This commit fixes this failure mode by attempting to detect when this temporary localhost server will actually work and only runs one if it will.

Of course, this doesn't resolve the issue for users that can't run a localhost server -- so in this situation, we now allow users to manually paste the authorisation code into the console from... somewhere. In the likely event that they don't want to be redirected to localhost for this step, this commit also fulfills the longstanding request for control over the redirect URL.

The combination of a manual copy & paste step with a custom redirect URL is what the gargle package terms the "pseudo out-of-band" flow, and the design used here is largely based on that package.

Because of this compatibility, users could register the existing public page at https://www.tidyverse.org/google-callback/ with their OAuth provider, host their own clone, or create a variant.

Control over the redirect URL is backwards-compatible but does require making the host_name parameter deprecated, unfortunately.

Unit tests for new functions are included, as are various updates to the documentation.

Fixes #165 and #176.

@atheriel atheriel force-pushed the aj-oauth-auth-code-improvements branch from 21deb15 to faa980d Compare July 5, 2023 16:29
Copy link
Member

@jennybc jennybc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exciting!

I feel like we should discuss this recommendation:

Because of this compatibility, users could register the existing public page at https://www.tidyverse.org/google-callback/ with their OAuth provider, host their own clone, or create a variant.

I've gotten the question over in gargle land whether it's OK for other folks to use https://www.tidyverse.org/google-callback/ and, while I can't see any reason to say "no, please don't", it also feels a bit odd. How do we feel about that? 🤔

#' low-level functions can be used to assemble a custom flow for APIs that are
#' further from the spec:
#' `oauth_flow_auth_code()` is a high-level wrapper that should work with APIs
#' that adhere relatively closely to the spec. When possible, it redirects users
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if "redirects users" is the right phrasing. Because, when this works, the user tends to be completely unaware of the brief existence of the local webserver. Maybe reword this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about "redirects the browser"?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the wording.

R/oauth-flow-auth-code.R Show resolved Hide resolved
@atheriel
Copy link
Collaborator Author

atheriel commented Jul 5, 2023

You're right, I don't feel like we can recommend using the gargle site in its current incarnation. The language on that page is very narrowly scoped to its target audience. I mentioned it mostly for demo purposes.

It's also my hope that the upcoming Workbench changes for redirects will make this pseudo-OOB workflow more uncommon -- it's really here as a last resort.

(As an aside: compatibility with gargle here might make it easier to refactor gargle to use httr2 one day.)

Copy link
Member

@hadley hadley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

I think if we continue further down this road, we'll probably want a vignette on the pseudo-OOB approach.

R/oauth-flow-auth-code.R Outdated Show resolved Hide resolved
R/oauth-flow-auth-code.R Outdated Show resolved Hide resolved

# For backwards compatibility, fall back to the original redirect URL
# construction.
if (!is.null(host_name)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

tests/testthat/test-oauth-flow-auth-code.R Outdated Show resolved Hide resolved
R/oauth-flow-auth-code.R Show resolved Hide resolved
R/oauth-flow-auth-code.R Show resolved Hide resolved
R/oauth-flow-auth-code.R Outdated Show resolved Hide resolved
R/oauth-flow-auth-code.R Outdated Show resolved Hide resolved
@atheriel atheriel force-pushed the aj-oauth-auth-code-improvements branch 3 times, most recently from 478bc46 to 23bf886 Compare July 7, 2023 13:44
When sheparding a user through an interactive authorisation code flow to
access APIs protected by OAuth2, the oauth_flow_auth_code() function
runs a temporary webserver on localhost to capture the code. This is a
very common technique for OAuth client libraries.

Unfortunately, this doesn't work in environments where localhost is
probably inaccessible to users -- which is normally the case on RStudio
Server and Posit Workbench. Previously the oauth_flow_auth_code()
function would hang indefinitely in these environments.

This commit fixes this failure mode by attempting to detect when this
temporary localhost server will actually work and only runs one if it
will.

Of course, this doesn't resolve the issue for users that can't run a
localhost server -- so in this situation, we now allow users to manually
paste the authorisation code into the console from... somewhere. In the
likely event that they don't want to be redirected to localhost for this
step, this commit also fulfills the longstanding request for control
over the redirect URL.

The combination of a manual copy & paste step with a custom redirect URL
is what the gargle package terms the "pseudo out-of-band" flow, and the
design used here is largely based on that package.

Because of this compatibility, users could register the existing public
page at https://www.tidyverse.org/google-callback/ with their OAuth
provider, host their own clone, or create a variant.

Control over the redirect URL is backwards-compatible but does require
making the 'host_name' parameter deprecated, unfortunately.

Unit tests for new functions are included, as are various updates to the
documentation.

Fixes r-lib#165 and r-lib#176.

Signed-off-by: Aaron Jacobs <[email protected]>
@atheriel atheriel force-pushed the aj-oauth-auth-code-improvements branch from 23bf886 to cdb4cce Compare July 7, 2023 13:44
R/oauth-flow-auth-code.R Outdated Show resolved Hide resolved
DESCRIPTION Show resolved Hide resolved
R/oauth-flow-auth-code.R Outdated Show resolved Hide resolved
NEWS.md Outdated Show resolved Hide resolved
@jonthegeek
Copy link
Contributor

This looks great! I can't wait for it to be merged!

host_ip = "127.0.0.1",
port = httpuv::randomPort()
port = httpuv::randomPort(),
redirect_uri = "http://localhost"
Copy link
Member

@hadley hadley Aug 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should deprecate port too, replacing this default with something that automatically adds the random port; i.e. we make redirect_uri the primary argument and extract the other pieces using url_parse().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's what I said above, previously 😆. Let's leave this as is for this PR and consider a refactoring in a future PR: #291.

hadley added a commit that referenced this pull request Aug 29, 2023
@hadley hadley requested a review from jennybc August 29, 2023 22:39
@@ -56,9 +56,11 @@ req_oauth_auth_code <- function(req, client,
pkce = TRUE,
auth_params = list(),
token_params = list(),
type = c("desktop", "web"),
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency, maybe we should use type "hosted" here instead of type "web".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It occurred to me that maybe we could do this automatically based on whether or not the redirect_url was to localhost?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me merge this and then present a concrete proposal.

@hadley hadley merged commit 7f603ff into r-lib:main Aug 30, 2023
12 checks passed
@atheriel atheriel deleted the aj-oauth-auth-code-improvements branch September 25, 2023 14:39
atheriel added a commit to atheriel/httr2 that referenced this pull request Sep 26, 2023
For data scientists using local tools or working in the IDE,
authentication with APIs or databases that use the OAuth2 authorisation
code flow (e.g. Google, Azure, or GitHub) typically work by running a
temporary HTTP server on localhost and using that as the redirect URL.
When a user gets redirected to localhost, the temporary HTTP server
captures the code automatically and surfaces it to the caller.

Unfortunately, this doesn't work at all on Workbench, because you can't
just "redirect to localhost" like you can on your desktop.

In r-lib#248 and follow-ups we added a workaround for this issue that we term
the "pseudo out-of-band" flow -- that mechanism was pioneered by the
{gargle} package in response to Google deprecating the original
"out-of-band" flow.

This commit adds support for a *third* mechanism for the auth code flow
which uses a feature of the upcoming Workbench release that allows it to
serve as a redirect URL for arbitrary OAuth2 applications on a static
/oauth_redirect_callback endpoint. httr2 can retrieve any code sent to
this URL by calling a simple JSON API at a static /oauth_code endpoint.

This mechanism has strong appeal over the pseudo out-of-band flow
because it sidesteps the requirement that users copy & paste the code,
making it feel much more natural and automatic. It also avoids the need
to ask users to host a static "capture your code" page, a la
<https://www.tidyverse.org/google-callback/>, because every Workbench
instance now has one.

(This Workbench feature was explicitly designed for packages like httr2
to make use of so that the auth code flow starts feeling like magic once
again.)

I've decided to introduce client-specific environment variables here --
specifically, HTTR2_OAUTH_REDIRECT_URL and HTTR2_OAUTH_CODE_SOURCE_URL
-- rather than having platform-specific ones prefixed with `WORKBENCH_`.

If you're running a daily build of Workbench locally, you can test this
as follows:

    local({
      # Automatically determine the Workbench server's URL. We use
      # RSTUDIO_HTTP_REFERER here, which is only set in RStudio Pro sessions.
      url_split <- strsplit(Sys.getenv("RSTUDIO_HTTP_REFERER"), "/s/", fixed = TRUE)[[1]]
      if (nchar(url_split[1]) != 0L && length(url_split) == 2L) {
        base_url <- url_split[1]
        Sys.setenv(
          HTTR2_OAUTH_REDIRECT_URL = sprintf("%s/oauth_redirect_callback", base_url),
          HTTR2_OAUTH_CODE_SOURCE_URL = sprintf("%s/oauth_code", base_url)
        )
      }
    })

    client <- oauth_client(
      id = "28acfec0674bb3da9f38",
      secret = obfuscated(paste0(
        "J9iiGmyelHltyxqrHXW41ZZPZamyUNxSX1_uKnv",
        "PeinhhxET_7FfUs2X0LLKotXY2bpgOMoHRCo"
      )),
      token_url = "https://github.com/login/oauth/access_token",
      name = "hadley-oauth-test"
    )

    oauth_flow_auth_code(
      client, auth_url = "https://github.com/login/oauth/authorize"
    )

In the future we can set the environment variables in Workbench sessions
directly.

Unit tests are included.

Signed-off-by: Aaron Jacobs <[email protected]>
hadley pushed a commit that referenced this pull request Oct 8, 2023
For data scientists using local tools or working in the IDE,
authentication with APIs or databases that use the OAuth2 authorisation
code flow (e.g. Google, Azure, or GitHub) typically work by running a
temporary HTTP server on localhost and using that as the redirect URL.
When a user gets redirected to localhost, the temporary HTTP server
captures the code automatically and surfaces it to the caller.

Unfortunately, this doesn't work at all on Workbench, because you can't
just "redirect to localhost" like you can on your desktop.

In #248 and follow-ups we added a workaround for this issue that we term
the "pseudo out-of-band" flow -- that mechanism was pioneered by the
{gargle} package in response to Google deprecating the original
"out-of-band" flow.

This commit adds support for a *third* mechanism for the auth code flow
which uses a feature of the upcoming Workbench release that allows it to
serve as a redirect URL for arbitrary OAuth2 applications on a static
/oauth_redirect_callback endpoint. httr2 can retrieve any code sent to
this URL by calling a simple JSON API at a static /oauth_code endpoint.

This mechanism has strong appeal over the pseudo out-of-band flow
because it sidesteps the requirement that users copy & paste the code,
making it feel much more natural and automatic. It also avoids the need
to ask users to host a static "capture your code" page, a la
<https://www.tidyverse.org/google-callback/>, because every Workbench
instance now has one.

(This Workbench feature was explicitly designed for packages like httr2
to make use of so that the auth code flow starts feeling like magic once
again.)

I've decided to introduce client-specific environment variables here --
specifically, HTTR2_OAUTH_REDIRECT_URL and HTTR2_OAUTH_CODE_SOURCE_URL
-- rather than having platform-specific ones prefixed with `WORKBENCH_`.

If you're running a daily build of Workbench locally, you can test this
as follows:

    local({
      # Automatically determine the Workbench server's URL. We use
      # RSTUDIO_HTTP_REFERER here, which is only set in RStudio Pro sessions.
      url_split <- strsplit(Sys.getenv("RSTUDIO_HTTP_REFERER"), "/s/", fixed = TRUE)[[1]]
      if (nchar(url_split[1]) != 0L && length(url_split) == 2L) {
        base_url <- url_split[1]
        Sys.setenv(
          HTTR2_OAUTH_REDIRECT_URL = sprintf("%s/oauth_redirect_callback", base_url),
          HTTR2_OAUTH_CODE_SOURCE_URL = sprintf("%s/oauth_code", base_url)
        )
      }
    })

    client <- oauth_client(
      id = "28acfec0674bb3da9f38",
      secret = obfuscated(paste0(
        "J9iiGmyelHltyxqrHXW41ZZPZamyUNxSX1_uKnv",
        "PeinhhxET_7FfUs2X0LLKotXY2bpgOMoHRCo"
      )),
      token_url = "https://github.com/login/oauth/access_token",
      name = "hadley-oauth-test"
    )

    oauth_flow_auth_code(
      client, auth_url = "https://github.com/login/oauth/authorize"
    )

In the future we can set the environment variables in Workbench sessions
directly.

Unit tests are included.

Signed-off-by: Aaron Jacobs <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Can't provide custom redirect_uri in oauth_auth_code()
4 participants