Add echo cancellation to loopback audio capture #27

Open
paterkleomenis wants to merge 3 commits into desktop-app:master from paterkleomenis:loopback-echo-cancellation
@paterkleomenis
Contributor

@paterkleomenis paterkleomenis commented Mar 4, 2026

⚠️ Depends on #24 (which depends on #28); must be merged in order: #28 → #24 → #27

When sharing system audio in stereo, playback leaks back through the capture path and remote participants hear an echo. This adds a WebRTC AudioProcessing (AEC) stage to the OpenAL capture loop inside MixingAudioDeviceModule.

Changes

  • Create an AudioProcessing instance with echo_canceller enabled (mobile_mode) when the capture device opens in stereo.
  • Feed far-end (rendered) audio into ProcessReverseAudioFrame via LoopbackCaptureTakeFarEndLinux, which returns the estimated delay.
  • Run the captured microphone + loopback mix through ProcessAudioFrame before pushing samples to the collector.
  • Track capture activity with SetLoopbackCaptureActiveLinux so the far-end buffer is only maintained while capture is running.

Falls back to the existing non-AEC path when stereo capture is not available or no far-end reference frame is ready.
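
The far-end bookkeeping described above might look roughly like the following. This is a minimal sketch, not the PR's code: `FarEndBuffer`, `FarEndFrame`, the 50-frame cap, and the timestamp-difference delay estimate are all assumptions made for illustration. The real logic sits behind LoopbackCaptureTakeFarEndLinux and SetLoopbackCaptureActiveLinux and may work differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical far-end reference frame handed to the echo canceller.
struct FarEndFrame {
	std::vector<int16_t> samples;
	int delayMs = 0; // Assumed estimate: time since the frame was rendered.
};

// Hypothetical buffer of rendered (far-end) frames. It only stores frames
// while capture is marked active, so nothing accumulates when idle.
class FarEndBuffer {
public:
	void setActive(bool active) {
		std::lock_guard<std::mutex> lock(_mutex);
		_active = active;
		if (!active) {
			_frames.clear();
		}
	}

	// Called from the playback side with a frame and its render time.
	void push(std::vector<int16_t> samples, int64_t renderedAtMs) {
		std::lock_guard<std::mutex> lock(_mutex);
		if (!_active) {
			return;
		}
		if (_frames.size() == kMaxFrames) {
			_frames.pop_front(); // Assumed policy: drop the oldest frame.
		}
		_frames.push_back(Entry{ std::move(samples), renderedAtMs });
	}

	// Called from the capture side. An empty result means "no reference
	// ready", in which case the caller would take the non-AEC path.
	std::optional<FarEndFrame> take(int64_t nowMs) {
		std::lock_guard<std::mutex> lock(_mutex);
		if (_frames.empty()) {
			return std::nullopt;
		}
		Entry entry = std::move(_frames.front());
		_frames.pop_front();
		return FarEndFrame{
			std::move(entry.samples),
			static_cast<int>(nowMs - entry.renderedAtMs)
		};
	}

private:
	struct Entry {
		std::vector<int16_t> samples;
		int64_t renderedAtMs = 0;
	};
	static constexpr std::size_t kMaxFrames = 50; // Assumed cap.

	std::mutex _mutex;
	bool _active = false;
	std::deque<Entry> _frames;
};
```

In this sketch the capture loop would call `take()` once per captured frame and, when it returns an empty optional, push the unprocessed samples to the collector, which mirrors the fallback described above.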

@ilya-fedin
Contributor

The PR has conflicts and commits from other PRs

@paterkleomenis force-pushed the loopback-echo-cancellation branch from 0038528 to d3d14ef on March 5, 2026, 11:41
@ilya-fedin
Contributor

The PR still has commits from other PRs, please fix

@paterkleomenis force-pushed the loopback-echo-cancellation branch from d3d14ef to d906c1c on March 5, 2026, 11:55
@ilya-fedin
Contributor

Did you squash this one too? 😭

Introduces infrastructure to mix system-audio loopback into the outgoing
microphone stream during a call, on both Linux and Windows.

New types (in webrtc_create_adm.cpp, anonymous namespace):

  LoopbackCollector
    Thread-safe mono ring buffer (max 2 s at 48 kHz). Loopback capture
    threads call pushSamples(); MixingAudioTransport calls readAndMix()
    to saturating-add loopback audio into the mic frame.

  DirectLoopbackCapture  (Linux only, guarded by WEBRTC_LINUX)
    Background std::thread that opens the PulseAudio monitor source via
    alcCaptureOpenDevice (stereo preferred, mono fallback), feeds decoded
    frames into LoopbackCollector, and stops cleanly on destruction.
    findMonitorDevice() prefers the monitor that matches the current
    default playback sink.

  MixingAudioTransport
    webrtc::AudioTransport decorator. When mixing is enabled it copies the
    mic buffer, calls LoopbackCollector::readAndMix, then forwards the
    blended frame to the inner transport. Disabled path is zero-overhead.

  MixingAudioDeviceModule  (details namespace)
    webrtc::AudioDeviceModule wrapper. Installs MixingAudioTransport in
    RegisterAudioCallback(), starts/stops DirectLoopbackCapture (Linux) or
    a dedicated loopback ADM + LoopbackAdmTransport (Windows) when
    setLoopbackEnabled() is toggled via MixingAudioControl.
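
For illustration, the ring buffer and saturating mix described for LoopbackCollector could be sketched as below. This is not the code from the commit: the class name `LoopbackCollectorSketch`, the mutex-based locking, the drop-oldest overflow policy, and the return value of `readAndMix` are assumptions. Only the capacity (2 s of mono audio at 48 kHz) and the saturating add come from the description.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <mutex>
#include <vector>

// Hypothetical stand-in for LoopbackCollector: a mutex-guarded mono ring
// buffer holding at most two seconds of 48 kHz audio.
class LoopbackCollectorSketch {
public:
	static constexpr std::size_t kCapacity = 2 * 48000;

	// Called by the loopback capture thread.
	void pushSamples(const int16_t *data, std::size_t count) {
		std::lock_guard<std::mutex> lock(_mutex);
		for (std::size_t i = 0; i != count; ++i) {
			if (_size == kCapacity) {
				// Assumed policy: overwrite the oldest sample when full.
				_head = (_head + 1) % kCapacity;
				--_size;
			}
			_buffer[(_head + _size) % kCapacity] = data[i];
			++_size;
		}
	}

	// Called by the audio transport. Adds buffered loopback samples into
	// the mic frame, clamping to the int16_t range instead of wrapping.
	// Returns how many samples were mixed.
	std::size_t readAndMix(int16_t *mic, std::size_t count) {
		std::lock_guard<std::mutex> lock(_mutex);
		const std::size_t available = std::min(count, _size);
		for (std::size_t i = 0; i != available; ++i) {
			const int32_t sum = int32_t(mic[i]) + int32_t(_buffer[_head]);
			mic[i] = static_cast<int16_t>(std::clamp<int32_t>(
				sum,
				std::numeric_limits<int16_t>::min(),
				std::numeric_limits<int16_t>::max()));
			_head = (_head + 1) % kCapacity;
		}
		_size -= available;
		return available;
	}

private:
	std::mutex _mutex;
	std::vector<int16_t> _buffer = std::vector<int16_t>(kCapacity);
	std::size_t _head = 0; // Index of the oldest buffered sample.
	std::size_t _size = 0; // Number of buffered samples.
};
```

Widening to 32 bits before the addition is what makes the clamp meaningful: two loud int16_t samples would otherwise overflow and wrap, which is audible as a click.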

New public API (webrtc_create_adm.h):

  MixingAudioControl
    Shared handle that survives ADM recreation. setLoopbackEnabled() /
    loopbackEnabled() let callers toggle mixing at any time; the control
    re-applies pending state whenever a new MixingAudioDeviceModule
    attaches itself.

  MixingAudioDeviceModuleCreator(innerCreator, control)
    Returns a creator lambda that wraps any ADM (e.g. the OpenAL ADM)
    inside a MixingAudioDeviceModule.
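
The "survives ADM recreation" behaviour of the control handle can be sketched as follows. Everything here is hypothetical (`ControlSketch`, `ModuleSketch`, `attach`, `detach`, `applyLoopbackEnabled`); it only illustrates the idea of a handle that remembers the requested state and re-applies it to whichever module attaches next. The actual MixingAudioControl interface beyond setLoopbackEnabled() / loopbackEnabled() is not shown in the commit message.

```cpp
#include <mutex>

// Hypothetical stand-in for the wrapped ADM: it just records the state
// that the control pushed into it.
class ModuleSketch {
public:
	void applyLoopbackEnabled(bool enabled) { _enabled = enabled; }
	bool enabled() const { return _enabled; }

private:
	bool _enabled = false;
};

// Hypothetical stand-in for the shared control handle.
class ControlSketch {
public:
	void setLoopbackEnabled(bool enabled) {
		std::lock_guard<std::mutex> lock(_mutex);
		_enabled = enabled; // Remembered even if no module is attached.
		if (_module) {
			_module->applyLoopbackEnabled(enabled);
		}
	}

	bool loopbackEnabled() const {
		std::lock_guard<std::mutex> lock(_mutex);
		return _enabled;
	}

	// A freshly created module attaches itself and immediately receives
	// whatever state was requested earlier.
	void attach(ModuleSketch *module) {
		std::lock_guard<std::mutex> lock(_mutex);
		_module = module;
		if (_module) {
			_module->applyLoopbackEnabled(_enabled);
		}
	}

	void detach(ModuleSketch *module) {
		std::lock_guard<std::mutex> lock(_mutex);
		if (_module == module) {
			_module = nullptr;
		}
	}

private:
	mutable std::mutex _mutex;
	bool _enabled = false;
	ModuleSketch *_module = nullptr;
};
```

With this shape, a caller can toggle loopback before any ADM exists or between an ADM being destroyed and its replacement being created, and the new module still starts in the requested state.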
…odule

When screen audio is being shared in a call, muting the microphone should
not kill the system-audio stream. The two new runtime controls address this:

  setMicrophoneMuted(bool)
    Zeroes out the mic samples inside MixingAudioTransport::RecordedDataIsAvailable
    before they reach WebRTC, while keeping the audio channel open so that
    loopback (system audio) continues to flow through unchanged.

  setPlaybackVolume(float)
    Scales every decoded playback sample by the given factor (0.0-1.0) in
    MixingAudioTransport::NeedMorePlayData, providing per-call software volume
    control independent of the system mixer.

Both controls are exposed on the shared MixingAudioControl handle and are
backed by atomic fields, so changes take effect immediately without taking
a lock in the hot path.
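
A minimal sketch of how the two controls could be applied with atomics is shown below. The names `MixingControlSketch`, `applyMicrophoneMute`, and `applyPlaybackVolume` are hypothetical, and clamping the volume to the 0.0-1.0 range is an assumption; the commit message only states that mic samples are zeroed in RecordedDataIsAvailable and that playback samples are scaled in NeedMorePlayData.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the atomic fields on the shared control.
struct MixingControlSketch {
	std::atomic<bool> microphoneMuted{ false };
	std::atomic<float> playbackVolume{ 1.0f };

	void setMicrophoneMuted(bool muted) {
		microphoneMuted.store(muted, std::memory_order_relaxed);
	}
	void setPlaybackVolume(float volume) {
		// Assumption: out-of-range values are clamped to 0.0-1.0.
		playbackVolume.store(
			std::clamp(volume, 0.0f, 1.0f),
			std::memory_order_relaxed);
	}
};

// Capture path: zero the mic samples when muted. In the described design
// this would run before loopback audio is mixed in, so system audio keeps
// flowing while the microphone is silent.
inline void applyMicrophoneMute(
		const MixingControlSketch &control,
		int16_t *samples,
		std::size_t count) {
	if (control.microphoneMuted.load(std::memory_order_relaxed)) {
		std::fill_n(samples, count, int16_t{ 0 });
	}
}

// Playback path: scale every decoded sample by the current volume.
inline void applyPlaybackVolume(
		const MixingControlSketch &control,
		int16_t *samples,
		std::size_t count) {
	const float volume = control.playbackVolume.load(
		std::memory_order_relaxed);
	if (volume >= 1.0f) {
		return; // Full volume: leave the samples untouched.
	}
	for (std::size_t i = 0; i != count; ++i) {
		samples[i] = static_cast<int16_t>(samples[i] * volume);
	}
}
```

Each audio callback reads the atomics once per frame, so a UI thread can change either setting at any time without blocking the audio thread.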

State is persisted in MixingAudioControl and re-applied whenever a new
MixingAudioDeviceModule attaches (e.g. after an ADM restart) or whenever
RegisterAudioCallback() installs a fresh MixingAudioTransport.

When sharing system audio in stereo, playback leaks back through the
capture path and remote participants hear an echo. This adds a
WebRTC AudioProcessing (AEC) stage to the OpenAL capture loop inside
MixingAudioDeviceModule:

  - Create an AudioProcessing instance with echo_canceller enabled
    (mobile_mode) when the capture device opens in stereo.
  - Feed far-end (rendered) audio into ProcessReverseAudioFrame via
    LoopbackCaptureTakeFarEndLinux, which returns the estimated delay.
  - Run the captured microphone + loopback mix through
    ProcessAudioFrame before pushing samples to the collector.
  - Track capture activity with SetLoopbackCaptureActiveLinux so the
    far-end buffer is only maintained while capture is running.

Falls back to the existing non-AEC path when stereo capture is not
available or no far-end reference frame is ready.

@paterkleomenis force-pushed the loopback-echo-cancellation branch from a7e9ed7 to 471f0c8 on March 5, 2026, 13:32