Skip to content

feat: Preliminary Support for Viper-backed OAuth in Sonos #2077

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

dljsjr
Copy link
Contributor

@dljsjr dljsjr commented Apr 15, 2025

Type of Change

  • New feature

Checklist

  • I have performed a self-review of my code
  • I have commented my code in hard-to-understand areas
  • I have verified my changes by testing with a device or have communicated a plan for testing (see note)
  • I am adding new behavior, such as adding a sub-driver, and have added and run new unit tests to cover the new behavior (see note)

Description of Change

This change adds preliminary support for the Sonos OAuth requirements that will take effect in August, allowing us to integrate with the new Hub functionality in the upcoming release that will allow for the Sonos driver to get an OAuth token from the mobile app.

Note

There are a few small TODO's here based on the shape of errors. We didn't have the ability to readily trigger the failure case for this, and we need to be able to do that to finish those TODO's.

We have a way forward on this now, though, so we'll be able to fill those in shortly.

Copy link

Copy link

github-actions bot commented Apr 15, 2025

Test Results

   66 files  ±0    425 suites  ±0   0s ⏱️ ±0s
2 175 tests ±0  2 175 ✅ ±0  0 💤 ±0  0 ❌ ±0 
3 708 runs  ±0  3 708 ✅ ±0  0 💤 ±0  0 ❌ ±0 

Results for commit 5fe0c4f. ± Comparison against base commit d773c03.

♻️ This comment has been updated with latest results.

Copy link

github-actions bot commented Apr 15, 2025

Minimum allowed coverage is 90%

Generated by 🐒 cobertura-action against 5fe0c4f

@dljsjr dljsjr force-pushed the feat/sonos-oauth-support branch from 90f2310 to 3e52f1a Compare April 15, 2025 23:01
end

if maybe_token then
wss_msg_header.authorization = string.format("Bearer %s", maybe_token.access_token)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the WSS connection, the Bearer token is part of the JSON payload's "header", not the HTTP headers. And the authorization key must be lowercase, according to the docs.

local start_success = sonos_conn:start()
if start_success then return end
if sonos_conn.driver.waiting_for_token and token_receive_handle then
local token, channel_error = token_receive_handle:receive()
Copy link
Contributor Author

@dljsjr dljsjr Apr 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know that we actually want to handle asking for the token in a loop; that said, I believe that this won't actually loop.

get_oauth_token() calls through to security.get_sonos_oauth_token(), which we have designed to be completely out-of-band async. So we fire-and-forget, with the return value either being nothing or an error due to perms/API compat.

The token_receive_handle is half of a cosock channel. We haven't put a timeout on it, and it's on a non-main thread, so it should just yield forever until it gets something. And that "something" should only come through if the send half of the channel gets dropped, or if the token arrives on the environment info update in the augmented store.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll have to think about this, but I don't know if no timeout is truly what we want. I could see a very large timeout, but at the same time I understand the potential load caused by thousands of hubs repeatedly checking for a token when the user doesn't link their account.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO we should have a timeout on this receive to avoid locking up entirely.


self:refresh_subscriptions(reply_tx)

local reply = reply_rx:receive()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the docs, the WSS connection itself won't actually be refused if we're missing credentials. Instead, the first command we try to send (which is a subscribe command in our case) will let us know if the auth fails.

So we've added this reply channel API to be able to make the initial subscription establishment in the :start() method wait until we know that the subscription is allowed.

@@ -178,12 +180,22 @@ local function scan_for_ssdp_updates(driver)
end)
end

local startup_state_received = false
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want all device activity to be more or less frozen until we get the start up state/initial contents of the augmented data store, so we have to queue the devices up.

if not startup_state_received then
log.warn(string.format("startup state not yet received, delaying init for %s", (device and device.label or "<unknown device>")))
if device and device.id then
devices_waiting_for_startup_state[device.id] = device
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's said queue.

get_oauth_token_receive_handle = function(self)
oauth_token_tx:subscribe()
end,
get_oauth_token = function(self)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_oauth_token will only return a token if we actually have one. Otherwise, it'll trigger the OAuth activity in the hub firwmare, and immediately return with an error indicating why.

@dljsjr dljsjr force-pushed the feat/sonos-oauth-support branch from 3e52f1a to 49cb2c5 Compare April 15, 2025 23:17
if decode_success then
self.oauth.token = decoded
self.waiting_for_token = false
self.oauth_token_tx:send(decoded)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is where token receipt actually happens. Notify anyone waiting for one.

-- We use the same handler for added and init here because at the time of authoring
-- this driver, there is a bug with LAN Edge Drivers where `init` may not be called
-- on every device that gets created using `try_create_device`. This makes sure that
-- a device is fully initialized whether we come from fresh join or restart.
-- See: https://smartthings.atlassian.net/browse/CHAD-9683
local function _initialize_device(driver, device)
if not startup_state_received then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we also need to do an API version check. Otherwise on old hubs this will be broken. Those old integrations will fail come august (unless the person enabled unsecure integrations from sonos), but I think we should be able to handle the driver running on an older version of the lua libs.

Update Lunchbox to latest version from Hue

chore: Improve type annotations
@dljsjr dljsjr force-pushed the feat/sonos-oauth-support branch from 9a22827 to 5fe0c4f Compare April 30, 2025 21:54
@@ -0,0 +1,445 @@
---@module 'result'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be checked in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦

Copy link
Contributor

@cjswedes cjswedes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did my best reviewing this, but it was a difficult review given the refactor and new functionality all in the same change set. I think it would probably be helpful for me to get some high level descriptions of the cosock tasks that are running and their lifecycles.

-- token has not expired yet
if now < expiration then
-- token is expiring soon, so we pre-emptively refresh
if math.abs(expiration - now) < ONE_HOUR_IN_SECONDS then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For how long do the tokens are the tokens valid?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just continue with the normal error patterns that we use and are pretty universal to Lua? Although the Result type makes sense for rust devs, it will confuse most of our community, thus it seems like extra memory for no clear functionality gain.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does look like its only used in the SSDP task, which in itself is difficult for anyone but a cosock maintainer to understand. Im still not sure what extra functionality it provides though.

Comment on lines -25 to -35
-- SSDP Messages use the HTTP/1.1 Header Field rules described in RFC 2616, 4.2: https://datatracker.ietf.org/doc/html/rfc2616#section-4.2
-- This pattern extracts the Key/Value pairs in to a Lua table via the two capture groups.
-- The key capture group is composed entirely of a negating matcher to exclude illegal characters, ending at the `:`.
-- The RFC states that after the colon there may be any arbitrary amount of leading space between the colon
-- and the value, and that the value shouldn't have any trailing whitespace, so we exclude those as well.
-- The original Luncheon implementation of this Lua Pattern used iteration and detected the `;` separator
-- that indicates key/value parameters, however, we don't make that distinction here and instead leave parsing
-- values with parameters to the consumers of the output of this function.
local k, v = string.match(l, '([^%c()<>@,;:\\"/%[%]?={} \t]+):%s*(.-)%s*$')
if k == nil or k == "" then
return nil, string.format("Couldn't parse header/value pair for line %q", l)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this handled now?

string.format("setwaker: SSDP search can only wake on receive readiness, got unsupported wake kind: %s.", kind))

assert(self.waker_ref == nil or waker == nil,
"Waker already set, cannot await SSDP serach instance from multiple locations.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
"Waker already set, cannot await SSDP serach instance from multiple locations.")
"Waker already set, cannot await SSDP search instance from multiple locations.")

---@return table<string,string>
---@overload fun(use_legacy_api_key: boolean, access_token: string?) table<string,string>
---@overload fun(access_token: string?) table<string,string>
function SonosApi.make_rest_headers(...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whats there a reason for having a variable number of arguments instead of just having the arguments be listed? It makes it a little difficult to understand what is happening. make_rest_headers(access_token, use_legacy_api_key) seems like it would be a clearer signature.

if body.errorCode == "ERROR_NOT_AUTHORIZED" then
local household_id, player_id = driver.sonos:get_player_for_device(device)
device.log.warn(string.format("WebSocket connection no longer authorized, disconnecting"))
local security_result, security_err = driver:request_oauth_token()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the reconnect loop will do this too, so it seems like it isn't needed here.

self:refresh_subscriptions()
local reply_tx, reply_rx = cosock.channel.new()

log.info("before refresh subscriptions")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: looks like these are all debug logs. Not sure if they are necessary or if they just need to be at a lower level

local connection_successful = true
if not connection_successful then
if not self.driver:is_waiting_for_oauth_token() then
local err = self.driver:get_oauth_token()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be:

Suggested change
local err = self.driver:get_oauth_token()
local _, err = self.driver:get_oauth_token()

device.log.debug("sending player info request")
driver.ssdp_task:get_player_info(player_info_tx, mac_addr)
device.log.debug("selecton on player info receive")
local recv_ready, _, select_err = cosock.socket.select({ player_info_rx }, nil, nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im guessing its been this way and there aren't any changes, just a moving of code, but why do we use select for one socket? Wouldn't calling player_info_rx:receive() have the same effect?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: needs to be removed prior to merge

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants