Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Require AudioDecoderConfig.description for channels > 2 with Opus? #826

Open
tguilbert-google opened this issue Aug 20, 2024 · 14 comments
Open

Comments

@tguilbert-google
Copy link
Member

I'm surfacing this issue, originally reported against Chromium's implementation.

The following fails on Chromium but works on Firefox

const config = {codec: 'opus', numberOfChannels: 6, sampleRate: 44100};
console.log(await AudioDecoder.isConfigSupported(config)); // {supported: true}

const decoder = new AudioDecoder({ error: console.error, output: console.log });

decoder.configure(config);

If the channel count is greater than 2 we can either:

  • have isConfigSupported() return {supported: false} when AudioDecoderConfig.description isn't set
  • generate extradata behind the scenes

Is this worth calling out in the spec, or adding a non-normative note?

@padenot, wdyt?

@padenot
Copy link
Collaborator

padenot commented Aug 21, 2024

There's already a precedent for generating description, iirc, with AAC.

@tguilbert-google
Copy link
Member Author

Generating headers seems reasonable, with a caveat:

I'd support to generating headers for channel mapping families 0 and 1 (defined here), when there are 8 channels or fewer and we can make a good guess at the mapping. For more than 8 channels, the channel mapping family is 255 and the mappings are entirely arbitrary, and asking for a description might be better than guessing.

@aboba
Copy link
Collaborator

aboba commented Aug 26, 2024

Is there a way to support Opus channel mapping families 2 and 3, or would that always be done via channel mapping family 255?

Also, I am wondering if there are implications for other codecs where multichannel is supported via IAMF.

@tguilbert-google
Copy link
Member Author

Thanks Bernard, I am just learning about channel mappings 2 and 3. Do we currently support these families when we supply the description/extradata? I don't have test media to verify this with.

I've ran into the following issue trying to generate a description for channel mapping family of 1: at the time audioDecoder.configure() is called, we don't know whether the encoded streams will be "streams" or "coupled streams". E.g. the header for a 5.1 channels audio file could have anywhere in-between 6 mono streams and 3 stereo streams.

Does it still make sense to generate a header in this case?

@padenot
Copy link
Collaborator

padenot commented Aug 27, 2024

I think you're right, I'm not sure what we're doing then.

@padenot
Copy link
Collaborator

padenot commented Aug 27, 2024

Filed https://bugzilla.mozilla.org/show_bug.cgi?id=1915143 for the Gecko side.

@padenot
Copy link
Collaborator

padenot commented Sep 16, 2024

have isConfigSupported() return {supported: false} when AudioDecoderConfig.description isn't set

Just a note that this isn't supposed to return false: https://w3c.github.io/webcodecs/#valid-audiodecoderconfig says to only check if it's detached, and we know that to some degree we want to check the channel count and sample-rate.

The idea here was to allow checking if a codec is supported, with some parameters, but not to require having bits of the media to check for support. e.g. if I want to know if a UA implement Flac decoding with Web Codecs (Flac being a codec that requires a description), and in particular e.g. 192khz / 7.1, I shouldn't have to download bits of a Flac asset (= the description) to know that, and if it's not supported, I can immediately fall back to something else without having to hit the network. In case of invalid or missing description, codec instantiation will fail as expected.

In light of #826 (comment) (that we obviously ran into as well), Gecko is going to implement requiring a description for channels > 2, because as @tguilbert-google that's the only thing possible. Some WPTs are added to check this, what Gecko was doing was nonsensical.

@tguilbert-google
Copy link
Member Author

Those are good clarifications. I think Chrome's implementation needs to revisit some of its error messages, which suggest calling isConfigSupported() when configure() fail, which is confusing for web authors.

What about the following case:

  • Opus, channels >2, no description: isConfigSupported() returns true.
  • Opus, channels >2, description is provided and clearly invalid: isConfigSupported() return true?

The first case is justifiable to me, as we want to allow quick and easy checks and early fallbacks.
In the second case, shouldn't we return false instead, as we know that the config isn't supported? Or does support refer more to the "form" of the config than its "content"? We could update the spec to call out that we might check channel counts and sample-rates and other top level items during isConfigSupported(), but bad descriptions fail at configure() time.

@padenot
Copy link
Collaborator

padenot commented Sep 17, 2024

Or does support refer more to the "form" of the config than its "content"? We could update the spec to call out that we might check channel counts and sample-rates and other top level items during isConfigSupported(), but bad descriptions fail at configure() time.

I've always thought of it like what you describe. The line is a bit blurry, but generally if you need to instantiate a byte stream parser, you've crossed it.

@sandersdan
Copy link
Contributor

FWIW, Chrome is already parsing description for H.264 and H.265, and will report unsupported if that fails.

@padenot
Copy link
Collaborator

padenot commented Sep 18, 2024

There's a middle ground in which UA are allowed to perform a more in-depth check. I think this is reasonable, and useful for authors.

@JonnyBurger
Copy link

From the spec:

If a description has been set, the bitstream is assumed to be in ogg format.
If a description has not been set, the bitstream is assumed to be in opus format.

What is one supposed to do if they have a opus bitstream with > 2 channels?
Wondering if the spec had this in mind, but also what the current implementations do.

@padenot
Copy link
Collaborator

padenot commented Sep 19, 2024

It doesn't exist. Opus is mono or stereo, and then you combine mono or stereo streams per https://datatracker.ietf.org/doc/html/rfc7845#section-5.1 and/or https://datatracker.ietf.org/doc/rfc8486/ to have a stream with more than 2 channels.

@JonnyBurger
Copy link

JonnyBurger commented Sep 19, 2024

Thanks for the explanation!

I understand now that the audio tracks tagged A_OPUS in WebM files are actually in OGG bitstream format (recognizable because of the presence of OpusHead). So this makes sense now to me!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants