Conversation
Implement the first batch of Phase 1 (Media & Rich Content) features: full audio playback support powered by rodio 0.22. Host (oxide-browser): - Add AudioEngine struct wrapping rodio's MixerDeviceSink and Player, lazily initialised on the first audio API call per tab - Register 10 new host functions on the "oxide" module: api_audio_play — decode and play in-memory bytes (WAV/MP3/OGG/FLAC) api_audio_play_url — fetch audio from a URL and play api_audio_pause — pause playback api_audio_resume — resume playback api_audio_stop — stop and clear queue api_audio_set_volume — set volume (0.0–2.0) api_audio_get_volume — query current volume api_audio_is_playing — check if audio is actively playing api_audio_position — get playback position in milliseconds api_audio_seek — seek to a position in milliseconds SDK (oxide-sdk): - Add FFI imports and safe Rust wrappers for all 10 audio functions Example (examples/audio-player): - Self-contained audio demo with tone pad buttons (A4/C5/E5/G5), custom frequency/duration tone generator with WAV synthesis in WASM, URL-based audio fetching, pause/resume/stop controls, volume slider, status display, and an animated visualiser bar Dependencies: - Add rodio 0.22.2 to oxide-browser (brings symphonia decoders for MP3, OGG/Vorbis, FLAC, WAV, AAC, and MP4/ISO) Made-with: Cursor
|
Caution Review failedPull request was closed or merged during review 📝 WalkthroughWalkthroughThis PR adds audio playback support: a host-side AudioEngine using rodio, new oxide-sdk FFI wrappers, a playable audio-player example (synthesis, URL playback, UI), workspace and roadmap updates, CI additions for ALSA dev package, and a small toolbar UI tweak. Changes
Sequence DiagramsequenceDiagram
participant App as Audio Player App
participant SDK as oxide-sdk
participant Host as oxide-browser (Host)
participant Engine as AudioEngine
participant Rodio as rodio
participant HTTP as HTTP Thread
App->>SDK: audio_play_url(url)
SDK->>Host: api_audio_play_url(url_ptr, url_len)
Host->>HTTP: spawn thread to fetch bytes (blocking HTTP)
HTTP-->>Host: send fetched bytes via channel
Host->>Engine: ensure AudioEngine exists (lazy init)
Engine->>Rodio: create MixerDeviceSink & Player
Host->>Rodio: decode bytes -> Source
Host->>Engine: player.play(source) (may target channel)
Rodio-->>Host: playback started
Host-->>SDK: return success (0)
SDK-->>App: success
App->>SDK: audio_position()
SDK->>Host: api_audio_position()
Host->>Engine: query channel/player position
Engine->>Rodio: request position
Rodio-->>Host: position ms
Host-->>SDK: return position
SDK-->>App: update UI
App->>SDK: audio_pause()
SDK->>Host: api_audio_pause()
Host->>Engine: player.pause()
Engine->>Rodio: pause playback
Rodio-->>Host: paused
Host-->>SDK: ack
SDK-->>App: status = "Paused"
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@examples/audio-player/src/lib.rs`:
- Around line 64-79: The button handlers currently call play_tone(...) and
unconditionally set unsafe LAST_NOTE, which hides playback failures; change each
handler (e.g., the ui_button blocks that call play_tone for NOTE_A4, NOTE_C5,
NOTE_E5, NOTE_G5 and the other occurrences mentioned) to check the
boolean/Result returned from play_tone (or propagate audio_play errors to return
a success flag) and only update unsafe LAST_NOTE = "..." when play_tone
indicates success; ensure play_tone/audio_play surfaces host errors (bool or
Result) so the callers can decide whether to set LAST_NOTE.
In `@oxide-browser/src/capabilities.rs`:
- Around line 1641-1669: The current audio_play_url flow spawns a thread that
fully buffers resp.bytes() and then the caller immediately blocks on rx.recv(),
causing large/slow downloads to stall and use unbounded memory; change this to
stream and/or bound the response: in the spawned closure (where fetch_url,
tx/rx, reqwest::blocking::Client and resp are used) replace resp.bytes() with
incremental reads (e.g., resp.take(MAX_BYTES).read(&mut buf) in a loop or using
resp.copy_to with a bounded reader) and send each chunk down the channel as it
becomes available, or enforce a capped total via resp.take(MAX_BYTES) to prevent
OOM; on the receiver side (where rx.recv() is used) start playback as soon as
the first chunk arrives rather than waiting for the entire payload, and preserve
error paths via console_log/ConsoleLevel::Error for failures—alternatively, if
you don't want streaming, rename the function/contract from audio_play_url to
indicate it is a buffered fetch-and-play.
- Around line 32-44: In play_bytes, avoid clearing the current player before
verifying the new payload decodes: call rodio::Decoder::try_from on a BufReader
wrapping the Cursor over the provided data first, and only if try_from returns
Ok(source) then call self.player.clear(), self.player.append(source) and
self.player.play(); on Err(_) return false without mutating the existing player.
Reference the play_bytes method and the use of rodio::Decoder::try_from,
std::io::Cursor, and self.player.clear/append/play to locate where to reorder
operations.
- Around line 1593-1597: The audio engine is only lazily initialized in play
paths, so pre-play volume changes via audio_set_volume()/audio_get_volume() are
lost; update the code that accesses caller.data().audio (the block using let
audio = caller.data().audio.clone(); let mut guard = audio.lock().unwrap();) to
ensure the AudioEngine is initialized there as well (call AudioEngine::try_new()
and set *guard when None) or alternatively store the requested volume in
HostState until AudioEngine is created and apply it when AudioEngine::try_new()
runs; make the change both where the guard is checked (the shown block) and the
analogous locations referenced (around the audio_set_volume/audio_get_volume
code paths).
In `@ROADMAP.md`:
- Around line 37-40: Update the roadmap bullets to reflect the shipped API:
change the function signature `audio_play(data, format)` to `audio_play(data)`
(remove the `format` parameter), and update the volume range in
`audio_set_volume(level)` / `audio_get_volume()` from "0.0–1.0" to "0.0–2.0";
keep the other entries (`audio_play_url(url)`,
`audio_pause()`/`audio_resume()`/`audio_stop()`) as-is so the list matches the
public SDK surface.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6cd6ec37-75b8-44c8-9d25-d4ef262105e9
📒 Files selected for processing (8)
Cargo.tomlROADMAP.mdexamples/audio-player/Cargo.tomlexamples/audio-player/src/lib.rsoxide-browser/Cargo.tomloxide-browser/src/capabilities.rsoxide-browser/src/ui.rsoxide-sdk/src/lib.rs
| if ui_button(BTN_TONE_A, 20.0, pad_y, pad_w, pad_h, "A4 440Hz") { | ||
| play_tone(NOTE_A4, 1.5); | ||
| unsafe { LAST_NOTE = "A4 (440 Hz)" }; | ||
| } | ||
| if ui_button(BTN_TONE_C, 120.0, pad_y, pad_w, pad_h, "C5 523Hz") { | ||
| play_tone(NOTE_C5, 1.5); | ||
| unsafe { LAST_NOTE = "C5 (523 Hz)" }; | ||
| } | ||
| if ui_button(BTN_TONE_E, 220.0, pad_y, pad_w, pad_h, "E5 659Hz") { | ||
| play_tone(NOTE_E5, 1.5); | ||
| unsafe { LAST_NOTE = "E5 (659 Hz)" }; | ||
| } | ||
| if ui_button(BTN_TONE_G, 320.0, pad_y, pad_w, pad_h, "G5 784Hz") { | ||
| play_tone(NOTE_G5, 1.5); | ||
| unsafe { LAST_NOTE = "G5 (784 Hz)" }; | ||
| } |
There was a problem hiding this comment.
Bubble audio_play failures back to the UI state.
play_tone() drops the host return code, so the callers here still stamp LAST_NOTE as if playback started. If the device is unavailable or the host rejects the bytes, the demo becomes misleading.
💡 One way to wire this up
-fn play_tone(frequency: f32, duration_secs: f32) {
+fn play_tone(frequency: f32, duration_secs: f32) -> bool {
let wav = generate_wav(frequency, duration_secs);
- audio_play(&wav);
+ audio_play(&wav) == 0
}Then only update LAST_NOTE in the button handlers when play_tone(...) returns true.
Also applies to: 132-135, 312-315
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/audio-player/src/lib.rs` around lines 64 - 79, The button handlers
currently call play_tone(...) and unconditionally set unsafe LAST_NOTE, which
hides playback failures; change each handler (e.g., the ui_button blocks that
call play_tone for NOTE_A4, NOTE_C5, NOTE_E5, NOTE_G5 and the other occurrences
mentioned) to check the boolean/Result returned from play_tone (or propagate
audio_play errors to return a success flag) and only update unsafe LAST_NOTE =
"..." when play_tone indicates success; ensure play_tone/audio_play surfaces
host errors (bool or Result) so the callers can decide whether to set LAST_NOTE.
| let (tx, rx) = std::sync::mpsc::sync_channel::<Result<Vec<u8>, String>>(1); | ||
| let fetch_url = url.clone(); | ||
| std::thread::spawn(move || { | ||
| let result = (|| -> Result<Vec<u8>, String> { | ||
| let client = reqwest::blocking::Client::builder() | ||
| .timeout(Duration::from_secs(30)) | ||
| .build() | ||
| .map_err(|e| e.to_string())?; | ||
| let resp = client.get(&fetch_url).send().map_err(|e| e.to_string())?; | ||
| if !resp.status().is_success() { | ||
| return Err(format!("HTTP {}", resp.status())); | ||
| } | ||
| resp.bytes().map(|b| b.to_vec()).map_err(|e| e.to_string()) | ||
| })(); | ||
| let _ = tx.send(result); | ||
| }); | ||
|
|
||
| let data = match rx.recv() { | ||
| Ok(Ok(bytes)) => bytes, | ||
| Ok(Err(e)) => { | ||
| console_log( | ||
| &caller.data().console, | ||
| ConsoleLevel::Error, | ||
| format!("[AUDIO] Fetch error: {e}"), | ||
| ); | ||
| return -1; | ||
| } | ||
| Err(_) => return -1, | ||
| }; |
There was a problem hiding this comment.
audio_play_url is still a full, blocking download.
Spawning a worker thread does not make this asynchronous here, because Line 1658 waits on recv() immediately, and Lines 1649-1653 buffer the entire body with bytes() before playback starts. Slow or large URLs will stall this call path and can consume unbounded memory. Either stream/bound the response, or rename the contract to buffered fetch-and-play.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@oxide-browser/src/capabilities.rs` around lines 1641 - 1669, The current
audio_play_url flow spawns a thread that fully buffers resp.bytes() and then the
caller immediately blocks on rx.recv(), causing large/slow downloads to stall
and use unbounded memory; change this to stream and/or bound the response: in
the spawned closure (where fetch_url, tx/rx, reqwest::blocking::Client and resp
are used) replace resp.bytes() with incremental reads (e.g.,
resp.take(MAX_BYTES).read(&mut buf) in a loop or using resp.copy_to with a
bounded reader) and send each chunk down the channel as it becomes available, or
enforce a capped total via resp.take(MAX_BYTES) to prevent OOM; on the receiver
side (where rx.recv() is used) start playback as soon as the first chunk arrives
rather than waiting for the entire payload, and preserve error paths via
console_log/ConsoleLevel::Error for failures—alternatively, if you don't want
streaming, rename the function/contract from audio_play_url to indicate it is a
buffered fetch-and-play.
| - [x] `audio_play(data, format)` — decode and play audio buffers (MP3, OGG, WAV, FLAC) | ||
| - [x] `audio_play_url(url)` — stream audio from a URL | ||
| - [x] `audio_pause()` / `audio_resume()` / `audio_stop()` — playback control | ||
| - [x] `audio_set_volume(level)` / `audio_get_volume()` — volume control (0.0–1.0) |
There was a problem hiding this comment.
Update these bullets to match the shipped API.
Line 37 still documents audio_play(data, format) even though the SDK now exposes audio_play(data), and Line 40 still says volume is 0.0–1.0 while the host accepts up to 2.0. Keeping the roadmap aligned with the public API will avoid misleading app authors.
📝 Suggested doc fix
-- [x] `audio_play(data, format)` — decode and play audio buffers (MP3, OGG, WAV, FLAC)
+- [x] `audio_play(data)` — decode and play audio buffers (MP3, OGG, WAV, FLAC)
-- [x] `audio_set_volume(level)` / `audio_get_volume()` — volume control (0.0–1.0)
+- [x] `audio_set_volume(level)` / `audio_get_volume()` — volume control (0.0–2.0)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - [x] `audio_play(data, format)` — decode and play audio buffers (MP3, OGG, WAV, FLAC) | |
| - [x] `audio_play_url(url)` — stream audio from a URL | |
| - [x] `audio_pause()` / `audio_resume()` / `audio_stop()` — playback control | |
| - [x] `audio_set_volume(level)` / `audio_get_volume()` — volume control (0.0–1.0) | |
| - [x] `audio_play(data)` — decode and play audio buffers (MP3, OGG, WAV, FLAC) | |
| - [x] `audio_play_url(url)` — stream audio from a URL | |
| - [x] `audio_pause()` / `audio_resume()` / `audio_stop()` — playback control | |
| - [x] `audio_set_volume(level)` / `audio_get_volume()` — volume control (0.0–2.0) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ROADMAP.md` around lines 37 - 40, Update the roadmap bullets to reflect the
shipped API: change the function signature `audio_play(data, format)` to
`audio_play(data)` (remove the `format` parameter), and update the volume range
in `audio_set_volume(level)` / `audio_get_volume()` from "0.0–1.0" to "0.0–2.0";
keep the other entries (`audio_play_url(url)`,
`audio_pause()`/`audio_resume()`/`audio_stop()`) as-is so the list matches the
public SDK surface.
Polish the audio API with three remaining Phase 1 audio features: Duration tracking: - Capture total_duration() from the rodio Decoder during playback - Expose via api_audio_duration() host function (returns ms, 0 if unknown) Loop playback: - Add per-channel looping flag toggled via api_audio_set_loop() - When enabled, sources are wrapped in repeat_infinite() before appending to the player, looping until explicitly stopped Multi-channel audio: - Restructure AudioEngine from a single Player to a HashMap<u32, AudioChannel> where each channel has its own Player, duration, and loop state - All existing api_audio_* functions continue to operate on channel 0 - New channel-specific functions for simultaneous playback: api_audio_channel_play — play audio on any channel (0+) api_audio_channel_stop — stop a specific channel api_audio_channel_set_volume — per-channel volume control - Multiple channels mix to the same device via rodio's mixer SDK: add audio_duration(), audio_set_loop(), audio_channel_play(), audio_channel_stop(), audio_channel_set_volume() wrappers Example: update audio-player demo with loop checkbox, duration display, and SFX channel section (blip/beep/chirp sounds that layer over main audio) Made-with: Cursor
Implement the first batch of Phase 1 (Media & Rich Content) features:
full audio playback support powered by rodio 0.22.
0324.mp4
Summary by CodeRabbit
New Features
Improvements
Documentation
Chores