NEW 302253
AAC AudioEncoder produces incorrect decoder config description
https://bugs.webkit.org/show_bug.cgi?id=302253
David
Reported 2025-11-10 05:25:33 PST
Repro (open in Safari): https://hilarious-souffle-80c4cb.netlify.app/

---

It appears to me that using an AudioEncoder to encode AAC emits incorrect "description" bytes in the decoder config. When I encode 48000 Hz mono audio, this is the description it gives me:

3, 128, 128, 128, 34, 0, 0, 0, 4, 128, 128, 128, 20, 64, 20, 0, 24, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 128, 128, 128, 2, 17, 136, 6, 128, 128, 128, 1, 2

Per spec, these bytes have to be an AudioSpecificConfig as defined in ISO 14496-3 Section 1.6.2.1. However, when interpreted according to this spec, they make no sense. They parse as:

Object type: 0 ("null", makes no sense)
Frequency index: 7 (-> 22050 Hz sample rate, also wrong)
Channel configuration: 0 (means "Defined in AOT Specific Config", but this could just be "1" indicating mono)

Chrome's description bytes are:

17, 136

This makes much more sense:

Object type: 2 (AAC LC)
Frequency index: 3 (-> 48000 Hz)
Channel configuration: 1 (-> mono)

So, it appears to me that WebKit is emitting gibberish here, albeit very deterministic gibberish. I'm sure it adheres to *some* format, but it definitely does not seem to adhere to ISO 14496-3 Section 1.6.2.1.

Note that piping the encoded audio back into AudioDecoder successfully decodes the audio (even with the faulty description), but muxing the audio alongside the description into an MP4 file creates a silent audio track, one that FFmpeg classifies as:

Stream #0:00x1: Audio: aac (mp4a / 0x6134706D), 22050 Hz, 0 channels, fltp, 118 kb/s (default)
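For anyone who wants to reproduce the field values quoted above, here is a minimal sketch of the bit unpacking. It assumes the plain layout of the first 13 bits of an AudioSpecificConfig (5-bit object type, 4-bit frequency index, 4-bit channel configuration) and does not handle the escape values (object type 31, frequency index 15). The names `parseAscHeader`, `AscFields`, and `SAMPLE_RATES` are hypothetical and not part of WebCodecs or any library.

```typescript
// Sampling frequencies by samplingFrequencyIndex (ISO 14496-3).
// Indices 13 and 14 are reserved, 15 is an escape value; all map to undefined here.
const SAMPLE_RATES: number[] = [
  96000, 88200, 64000, 48000, 44100, 32000, 24000,
  22050, 16000, 12000, 11025, 8000, 7350,
];

interface AscFields {
  objectType: number;
  frequencyIndex: number;
  sampleRate: number | undefined;
  channelConfiguration: number;
}

// Reads the first 13 bits of a (non-escaped) AudioSpecificConfig.
function parseAscHeader(bytes: ArrayLike<number>): AscFields {
  if (bytes.length < 2) throw new Error("need at least 2 bytes");
  const bits = (bytes[0] << 8) | bytes[1];           // first 16 bits, big-endian
  const objectType = bits >> 11;                     // top 5 bits
  const frequencyIndex = (bits >> 7) & 0x0f;         // next 4 bits
  const channelConfiguration = (bits >> 3) & 0x0f;   // next 4 bits
  return {
    objectType,
    frequencyIndex,
    sampleRate: SAMPLE_RATES[frequencyIndex],
    channelConfiguration,
  };
}

// First two bytes of the Safari description:
// objectType 0, frequencyIndex 7 (22050 Hz), channelConfiguration 0
console.log(parseAscHeader([3, 128]));

// Chrome description:
// objectType 2, frequencyIndex 3 (48000 Hz), channelConfiguration 1
console.log(parseAscHeader([17, 136]));
```

Only the first two bytes matter for these three fields, which is why the remaining 37 bytes of the Safari description do not change the (wrong) result.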
David
Comment 1 2025-11-10 06:01:33 PST
Update: The bytes appear to be the contents of an "esds" box, which contains the AudioSpecificConfig. The relevant bytes are [17, 136] (starting 8 bytes from the end), which are the actual description. However, emitting the whole box payload as the description violates the WebCodecs spec:

> If description is present, it is assumed to a AudioSpecificConfig as defined in [iso14496-3] section 1.6.2.1, Table 1.15, and the bitstream is assumed to be in aac.
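Until this is fixed, a possible workaround is to pull the AudioSpecificConfig out of the emitted bytes before muxing. The sketch below assumes the description is an MPEG-4 ES_Descriptor (tag 0x03) wrapping a DecoderConfigDescriptor (tag 0x04) that contains a DecoderSpecificInfo (tag 0x05), which matches the bytes in this report. `readDescriptorHeader` and `extractAudioSpecificConfig` are hypothetical helper names, not WebKit or WebCodecs APIs.

```typescript
interface DescriptorHeader {
  tag: number;
  size: number;
  body: number; // offset of the first payload byte
}

// Reads a descriptor tag followed by its size, which is stored in up to
// four bytes of 7 bits each; a set high bit means another size byte follows.
function readDescriptorHeader(data: Uint8Array, pos: number): DescriptorHeader {
  if (pos >= data.length) throw new Error("truncated descriptor");
  const tag = data[pos++];
  let size = 0;
  for (let i = 0; i < 4; i++) {
    if (pos >= data.length) throw new Error("truncated descriptor size");
    const b = data[pos++];
    size = (size << 7) | (b & 0x7f);
    if ((b & 0x80) === 0) break;
  }
  return { tag, size, body: pos };
}

// Returns the DecoderSpecificInfo payload (the AudioSpecificConfig for AAC),
// or null if the input does not have the assumed descriptor layout.
function extractAudioSpecificConfig(esDescriptor: Uint8Array): Uint8Array | null {
  const es = readDescriptorHeader(esDescriptor, 0);
  if (es.tag !== 0x03) return null;
  if (es.body + 3 > esDescriptor.length) return null;

  let pos = es.body + 2;                      // skip 16-bit ES_ID
  const flags = esDescriptor[pos++];
  if ((flags & 0x80) !== 0) pos += 2;         // dependsOn_ES_ID
  if ((flags & 0x40) !== 0) {                 // URL: length byte plus string
    if (pos >= esDescriptor.length) return null;
    pos += 1 + esDescriptor[pos];
  }
  if ((flags & 0x20) !== 0) pos += 2;         // OCR_ES_Id

  const dcd = readDescriptorHeader(esDescriptor, pos);
  if (dcd.tag !== 0x04) return null;

  // 13 fixed bytes: objectTypeIndication (1), streamType and flags (1),
  // bufferSizeDB (3), maxBitrate (4), avgBitrate (4).
  const dsi = readDescriptorHeader(esDescriptor, dcd.body + 13);
  if (dsi.tag !== 0x05) return null;
  return esDescriptor.slice(dsi.body, dsi.body + dsi.size);
}

// The description bytes from this report.
const safariDescription = new Uint8Array([
  3, 128, 128, 128, 34, 0, 0, 0, 4, 128, 128, 128, 20, 64, 20, 0, 24,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 128, 128, 128, 2, 17, 136,
  6, 128, 128, 128, 1, 2,
]);
console.log(extractAudioSpecificConfig(safariDescription)); // Uint8Array [17, 136]
```

A muxer could call this only when the description starts with 0x03 and otherwise pass the bytes through unchanged, so that spec-compliant output such as Chrome's [17, 136] is left untouched.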
Radar WebKit Bug Importer
Comment 2 2025-11-17 05:26:11 PST