Skip to content

feat: add audio playback API (Phase 1)#10

Merged
niklabh merged 7 commits intomainfrom
audio
Mar 24, 2026
Merged

feat: add audio playback API (Phase 1)#10
niklabh merged 7 commits intomainfrom
audio

Conversation

@niklabh
Copy link
Copy Markdown
Owner

@niklabh niklabh commented Mar 24, 2026

Implement the first batch of Phase 1 (Media & Rich Content) features:
full audio playback support powered by rodio 0.22.

0324.mp4
Host (oxide-browser):
- Add AudioEngine struct wrapping rodio's MixerDeviceSink and Player,
  lazily initialised on the first audio API call per tab
- Register 10 new host functions on the "oxide" module:
  api_audio_play         — decode and play in-memory bytes (WAV/MP3/OGG/FLAC)
  api_audio_play_url     — fetch audio from a URL and play
  api_audio_pause        — pause playback
  api_audio_resume       — resume playback
  api_audio_stop         — stop and clear queue
  api_audio_set_volume   — set volume (0.0–2.0)
  api_audio_get_volume   — query current volume
  api_audio_is_playing   — check if audio is actively playing
  api_audio_position     — get playback position in milliseconds
  api_audio_seek         — seek to a position in milliseconds

SDK (oxide-sdk):
- Add FFI imports and safe Rust wrappers for all 10 audio functions

Example (examples/audio-player):
- Self-contained audio demo with tone pad buttons (A4/C5/E5/G5),
  custom frequency/duration tone generator with WAV synthesis in WASM,
  URL-based audio fetching, pause/resume/stop controls, volume slider,
  status display, and an animated visualiser bar

Dependencies:
- Add rodio 0.22.2 to oxide-browser (brings symphonia decoders for
  MP3, OGG/Vorbis, FLAC, WAV, AAC, and MP4/ISO)

Summary by CodeRabbit

  • New Features

    • Full audio playback support: play/pause/resume/stop, volume, seek, position, duration, looping, and multi-channel playback.
    • Added an interactive audio player example with tone synthesis, URL streaming, transport, volume, SFX channels, and visualizer.
  • Improvements

    • Toolbar bookmark control refined for better styling and hover/click feedback.
  • Documentation

    • ROADMAP updated: core audio API checklist items marked complete.
  • Chores

    • Added audio backend dependency and updated CI to install required audio system package.

niklabh added 3 commits March 22, 2026 00:57
Implement the first batch of Phase 1 (Media & Rich Content) features:
full audio playback support powered by rodio 0.22.

Host (oxide-browser):
- Add AudioEngine struct wrapping rodio's MixerDeviceSink and Player,
  lazily initialised on the first audio API call per tab
- Register 10 new host functions on the "oxide" module:
  api_audio_play         — decode and play in-memory bytes (WAV/MP3/OGG/FLAC)
  api_audio_play_url     — fetch audio from a URL and play
  api_audio_pause        — pause playback
  api_audio_resume       — resume playback
  api_audio_stop         — stop and clear queue
  api_audio_set_volume   — set volume (0.0–2.0)
  api_audio_get_volume   — query current volume
  api_audio_is_playing   — check if audio is actively playing
  api_audio_position     — get playback position in milliseconds
  api_audio_seek         — seek to a position in milliseconds

SDK (oxide-sdk):
- Add FFI imports and safe Rust wrappers for all 10 audio functions

Example (examples/audio-player):
- Self-contained audio demo with tone pad buttons (A4/C5/E5/G5),
  custom frequency/duration tone generator with WAV synthesis in WASM,
  URL-based audio fetching, pause/resume/stop controls, volume slider,
  status display, and an animated visualiser bar

Dependencies:
- Add rodio 0.22.2 to oxide-browser (brings symphonia decoders for
  MP3, OGG/Vorbis, FLAC, WAV, AAC, and MP4/ISO)

Made-with: Cursor
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 24, 2026

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

Walkthrough

This PR adds audio playback support: a host-side AudioEngine using rodio, new oxide-sdk FFI wrappers, a playable audio-player example (synthesis, URL playback, UI), workspace and roadmap updates, CI additions for ALSA dev package, and a small toolbar UI tweak.

Changes

Cohort / File(s) Summary
Workspace & Roadmap
Cargo.toml, ROADMAP.md
Added examples/audio-player to workspace members; marked multiple audio checklist items completed in the roadmap (play, play_url, pause/resume/stop, volume, seek, duration, loop, multi-channel).
Example Crate
examples/audio-player/Cargo.toml, examples/audio-player/src/lib.rs
New audio-player example (cdylib) implementing UI, on-the-fly WAV synthesis, URL playback, transport/volume controls, SFX channel mixing, visualizer, and exported start_app / on_frame.
Browser Audio Engine & Host APIs
oxide-browser/Cargo.toml, oxide-browser/src/capabilities.rs
Added rodio dependency; introduced AudioEngine, channel model, lazy init in HostState.audio, and many api_audio_* host functions (byte & URL play, channel play/stop/volume, default-channel pause/resume/stop/volume/is_playing/position/seek/duration/loop) with explicit error codes and threaded URL fetch.
SDK Wrappers
oxide-sdk/src/lib.rs
Added safe Rust wrappers for the new host FFI: audio_play, audio_play_url, playback controls, per-channel play/stop/volume, status/position/seek/duration and loop toggling.
UI Refinement
oxide-browser/src/ui.rs
Replaced bookmark toolbar egui::Button with a custom-painted click area using ui.allocate_exact_size and ui.painter(); adjusted click/hover handling to new response.
CI / Release Workflows
.github/workflows/ci.yml, .github/workflows/release.yml
Added libasound2-dev to Linux runner apt package list to provide ALSA development headers for audio builds.

Sequence Diagram

sequenceDiagram
    participant App as Audio Player App
    participant SDK as oxide-sdk
    participant Host as oxide-browser (Host)
    participant Engine as AudioEngine
    participant Rodio as rodio
    participant HTTP as HTTP Thread

    App->>SDK: audio_play_url(url)
    SDK->>Host: api_audio_play_url(url_ptr, url_len)
    Host->>HTTP: spawn thread to fetch bytes (blocking HTTP)
    HTTP-->>Host: send fetched bytes via channel
    Host->>Engine: ensure AudioEngine exists (lazy init)
    Engine->>Rodio: create MixerDeviceSink & Player
    Host->>Rodio: decode bytes -> Source
    Host->>Engine: player.play(source) (may target channel)
    Rodio-->>Host: playback started
    Host-->>SDK: return success (0)
    SDK-->>App: success

    App->>SDK: audio_position()
    SDK->>Host: api_audio_position()
    Host->>Engine: query channel/player position
    Engine->>Rodio: request position
    Rodio-->>Host: position ms
    Host-->>SDK: return position
    SDK-->>App: update UI

    App->>SDK: audio_pause()
    SDK->>Host: api_audio_pause()
    Host->>Engine: player.pause()
    Engine->>Rodio: pause playback
    Rodio-->>Host: paused
    Host-->>SDK: ack
    SDK-->>App: status = "Paused"
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 I stitched a song from tiny hops of code,

I twitched my ears as audio flowed.
A synth, a blip, a URL’s hum,
I wiggle, play, and then I’m done—
Hooray! the playground filled with sound. 🎶

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically describes the main change: implementing audio playback API as Phase 1 of the roadmap. It accurately reflects the primary purpose of the pull request.
Docstring Coverage ✅ Passed Docstring coverage is 96.30% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch audio

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@examples/audio-player/src/lib.rs`:
- Around line 64-79: The button handlers currently call play_tone(...) and
unconditionally set unsafe LAST_NOTE, which hides playback failures; change each
handler (e.g., the ui_button blocks that call play_tone for NOTE_A4, NOTE_C5,
NOTE_E5, NOTE_G5 and the other occurrences mentioned) to check the
boolean/Result returned from play_tone (or propagate audio_play errors to return
a success flag) and only update unsafe LAST_NOTE = "..." when play_tone
indicates success; ensure play_tone/audio_play surfaces host errors (bool or
Result) so the callers can decide whether to set LAST_NOTE.

In `@oxide-browser/src/capabilities.rs`:
- Around line 1641-1669: The current audio_play_url flow spawns a thread that
fully buffers resp.bytes() and then the caller immediately blocks on rx.recv(),
causing large/slow downloads to stall and use unbounded memory; change this to
stream and/or bound the response: in the spawned closure (where fetch_url,
tx/rx, reqwest::blocking::Client and resp are used) replace resp.bytes() with
incremental reads (e.g., resp.take(MAX_BYTES).read(&mut buf) in a loop or using
resp.copy_to with a bounded reader) and send each chunk down the channel as it
becomes available, or enforce a capped total via resp.take(MAX_BYTES) to prevent
OOM; on the receiver side (where rx.recv() is used) start playback as soon as
the first chunk arrives rather than waiting for the entire payload, and preserve
error paths via console_log/ConsoleLevel::Error for failures—alternatively, if
you don't want streaming, rename the function/contract from audio_play_url to
indicate it is a buffered fetch-and-play.
- Around line 32-44: In play_bytes, avoid clearing the current player before
verifying the new payload decodes: call rodio::Decoder::try_from on a BufReader
wrapping the Cursor over the provided data first, and only if try_from returns
Ok(source) then call self.player.clear(), self.player.append(source) and
self.player.play(); on Err(_) return false without mutating the existing player.
Reference the play_bytes method and the use of rodio::Decoder::try_from,
std::io::Cursor, and self.player.clear/append/play to locate where to reorder
operations.
- Around line 1593-1597: The audio engine is only lazily initialized in play
paths, so pre-play volume changes via audio_set_volume()/audio_get_volume() are
lost; update the code that accesses caller.data().audio (the block using let
audio = caller.data().audio.clone(); let mut guard = audio.lock().unwrap();) to
ensure the AudioEngine is initialized there as well (call AudioEngine::try_new()
and set *guard when None) or alternatively store the requested volume in
HostState until AudioEngine is created and apply it when AudioEngine::try_new()
runs; make the change both where the guard is checked (the shown block) and the
analogous locations referenced (around the audio_set_volume/audio_get_volume
code paths).

In `@ROADMAP.md`:
- Around line 37-40: Update the roadmap bullets to reflect the shipped API:
change the function signature `audio_play(data, format)` to `audio_play(data)`
(remove the `format` parameter), and update the volume range in
`audio_set_volume(level)` / `audio_get_volume()` from "0.0–1.0" to "0.0–2.0";
keep the other entries (`audio_play_url(url)`,
`audio_pause()`/`audio_resume()`/`audio_stop()`) as-is so the list matches the
public SDK surface.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6cd6ec37-75b8-44c8-9d25-d4ef262105e9

📥 Commits

Reviewing files that changed from the base of the PR and between 4bbea77 and a8e7ff9.

📒 Files selected for processing (8)
  • Cargo.toml
  • ROADMAP.md
  • examples/audio-player/Cargo.toml
  • examples/audio-player/src/lib.rs
  • oxide-browser/Cargo.toml
  • oxide-browser/src/capabilities.rs
  • oxide-browser/src/ui.rs
  • oxide-sdk/src/lib.rs

Comment on lines +64 to +79
if ui_button(BTN_TONE_A, 20.0, pad_y, pad_w, pad_h, "A4 440Hz") {
play_tone(NOTE_A4, 1.5);
unsafe { LAST_NOTE = "A4 (440 Hz)" };
}
if ui_button(BTN_TONE_C, 120.0, pad_y, pad_w, pad_h, "C5 523Hz") {
play_tone(NOTE_C5, 1.5);
unsafe { LAST_NOTE = "C5 (523 Hz)" };
}
if ui_button(BTN_TONE_E, 220.0, pad_y, pad_w, pad_h, "E5 659Hz") {
play_tone(NOTE_E5, 1.5);
unsafe { LAST_NOTE = "E5 (659 Hz)" };
}
if ui_button(BTN_TONE_G, 320.0, pad_y, pad_w, pad_h, "G5 784Hz") {
play_tone(NOTE_G5, 1.5);
unsafe { LAST_NOTE = "G5 (784 Hz)" };
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Bubble audio_play failures back to the UI state.

play_tone() drops the host return code, so the callers here still stamp LAST_NOTE as if playback started. If the device is unavailable or the host rejects the bytes, the demo becomes misleading.

💡 One way to wire this up
-fn play_tone(frequency: f32, duration_secs: f32) {
+fn play_tone(frequency: f32, duration_secs: f32) -> bool {
     let wav = generate_wav(frequency, duration_secs);
-    audio_play(&wav);
+    audio_play(&wav) == 0
 }

Then only update LAST_NOTE in the button handlers when play_tone(...) returns true.

Also applies to: 132-135, 312-315

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/audio-player/src/lib.rs` around lines 64 - 79, The button handlers
currently call play_tone(...) and unconditionally set unsafe LAST_NOTE, which
hides playback failures; change each handler (e.g., the ui_button blocks that
call play_tone for NOTE_A4, NOTE_C5, NOTE_E5, NOTE_G5 and the other occurrences
mentioned) to check the boolean/Result returned from play_tone (or propagate
audio_play errors to return a success flag) and only update unsafe LAST_NOTE =
"..." when play_tone indicates success; ensure play_tone/audio_play surfaces
host errors (bool or Result) so the callers can decide whether to set LAST_NOTE.

Comment on lines +1641 to +1669
let (tx, rx) = std::sync::mpsc::sync_channel::<Result<Vec<u8>, String>>(1);
let fetch_url = url.clone();
std::thread::spawn(move || {
let result = (|| -> Result<Vec<u8>, String> {
let client = reqwest::blocking::Client::builder()
.timeout(Duration::from_secs(30))
.build()
.map_err(|e| e.to_string())?;
let resp = client.get(&fetch_url).send().map_err(|e| e.to_string())?;
if !resp.status().is_success() {
return Err(format!("HTTP {}", resp.status()));
}
resp.bytes().map(|b| b.to_vec()).map_err(|e| e.to_string())
})();
let _ = tx.send(result);
});

let data = match rx.recv() {
Ok(Ok(bytes)) => bytes,
Ok(Err(e)) => {
console_log(
&caller.data().console,
ConsoleLevel::Error,
format!("[AUDIO] Fetch error: {e}"),
);
return -1;
}
Err(_) => return -1,
};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

audio_play_url is still a full, blocking download.

Spawning a worker thread does not make this asynchronous here, because Line 1658 waits on recv() immediately, and Lines 1649-1653 buffer the entire body with bytes() before playback starts. Slow or large URLs will stall this call path and can consume unbounded memory. Either stream/bound the response, or rename the contract to buffered fetch-and-play.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@oxide-browser/src/capabilities.rs` around lines 1641 - 1669, The current
audio_play_url flow spawns a thread that fully buffers resp.bytes() and then the
caller immediately blocks on rx.recv(), causing large/slow downloads to stall
and use unbounded memory; change this to stream and/or bound the response: in
the spawned closure (where fetch_url, tx/rx, reqwest::blocking::Client and resp
are used) replace resp.bytes() with incremental reads (e.g.,
resp.take(MAX_BYTES).read(&mut buf) in a loop or using resp.copy_to with a
bounded reader) and send each chunk down the channel as it becomes available, or
enforce a capped total via resp.take(MAX_BYTES) to prevent OOM; on the receiver
side (where rx.recv() is used) start playback as soon as the first chunk arrives
rather than waiting for the entire payload, and preserve error paths via
console_log/ConsoleLevel::Error for failures—alternatively, if you don't want
streaming, rename the function/contract from audio_play_url to indicate it is a
buffered fetch-and-play.

Comment on lines +37 to +40
- [x] `audio_play(data, format)` — decode and play audio buffers (MP3, OGG, WAV, FLAC)
- [x] `audio_play_url(url)` — stream audio from a URL
- [x] `audio_pause()` / `audio_resume()` / `audio_stop()` — playback control
- [x] `audio_set_volume(level)` / `audio_get_volume()` — volume control (0.0–1.0)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Update these bullets to match the shipped API.

Line 37 still documents audio_play(data, format) even though the SDK now exposes audio_play(data), and Line 40 still says volume is 0.0–1.0 while the host accepts up to 2.0. Keeping the roadmap aligned with the public API will avoid misleading app authors.

📝 Suggested doc fix
-- [x] `audio_play(data, format)` — decode and play audio buffers (MP3, OGG, WAV, FLAC)
+- [x] `audio_play(data)` — decode and play audio buffers (MP3, OGG, WAV, FLAC)

-- [x] `audio_set_volume(level)` / `audio_get_volume()` — volume control (0.0–1.0)
+- [x] `audio_set_volume(level)` / `audio_get_volume()` — volume control (0.0–2.0)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- [x] `audio_play(data, format)` — decode and play audio buffers (MP3, OGG, WAV, FLAC)
- [x] `audio_play_url(url)` — stream audio from a URL
- [x] `audio_pause()` / `audio_resume()` / `audio_stop()` — playback control
- [x] `audio_set_volume(level)` / `audio_get_volume()` — volume control (0.0–1.0)
- [x] `audio_play(data)` — decode and play audio buffers (MP3, OGG, WAV, FLAC)
- [x] `audio_play_url(url)` — stream audio from a URL
- [x] `audio_pause()` / `audio_resume()` / `audio_stop()` — playback control
- [x] `audio_set_volume(level)` / `audio_get_volume()` — volume control (0.0–2.0)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ROADMAP.md` around lines 37 - 40, Update the roadmap bullets to reflect the
shipped API: change the function signature `audio_play(data, format)` to
`audio_play(data)` (remove the `format` parameter), and update the volume range
in `audio_set_volume(level)` / `audio_get_volume()` from "0.0–1.0" to "0.0–2.0";
keep the other entries (`audio_play_url(url)`,
`audio_pause()`/`audio_resume()`/`audio_stop()`) as-is so the list matches the
public SDK surface.

niklabh added 3 commits March 24, 2026 21:25
Polish the audio API with three remaining Phase 1 audio features:

Duration tracking:
- Capture total_duration() from the rodio Decoder during playback
- Expose via api_audio_duration() host function (returns ms, 0 if unknown)

Loop playback:
- Add per-channel looping flag toggled via api_audio_set_loop()
- When enabled, sources are wrapped in repeat_infinite() before
  appending to the player, looping until explicitly stopped

Multi-channel audio:
- Restructure AudioEngine from a single Player to a HashMap<u32, AudioChannel>
  where each channel has its own Player, duration, and loop state
- All existing api_audio_* functions continue to operate on channel 0
- New channel-specific functions for simultaneous playback:
  api_audio_channel_play     — play audio on any channel (0+)
  api_audio_channel_stop     — stop a specific channel
  api_audio_channel_set_volume — per-channel volume control
- Multiple channels mix to the same device via rodio's mixer

SDK: add audio_duration(), audio_set_loop(), audio_channel_play(),
audio_channel_stop(), audio_channel_set_volume() wrappers

Example: update audio-player demo with loop checkbox, duration display,
and SFX channel section (blip/beep/chirp sounds that layer over main audio)

Made-with: Cursor
@niklabh niklabh merged commit cbbdee7 into main Mar 24, 2026
4 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant