Bug 299541 (Status: NEW)
Unexpected DecompressionStream behaviour (arbitrary-padded data)
https://bugs.webkit.org/show_bug.cgi?id=299541
Summary: Unexpected DecompressionStream behaviour (arbitrary-padded data)
kdarutkin
Reported 2025-09-25 12:23:35 PDT
Created attachment 476857 [details]
reproduction script (console-friendly)

Hi,

When working with the DecompressionStream API I noticed a major difference between Safari, Chrome and Firefox. If you pass compressed binary data with arbitrary trailing padding, the decompression stream fails, however:

* In Chrome you can actually read chunks of decompressed data before encountering an error: `TypeError: Junk found after end of compressed data.`
* In Safari and Firefox the behaviour depends on chunking; if the last chunk has some extra bytes, the output is discarded and an error is triggered: `TypeError: Extra bytes past the end.` (Safari) and `TypeError: Unexpected input after the end of stream` (Firefox)

--------

After checking the spec https://compression.spec.whatwg.org/#dom-decompressionstream-decompressionstream both behaviours are spec-compliant, since:

* The spec requires that trailing data after the end of a compressed stream is an error for all three formats.
* During per-chunk processing, "decompress and enqueue a chunk" says "Let buffer be the result of decompressing… If this results in an error, then throw a TypeError." That means an implementation that notices the trailing bytes while handling a single incoming chunk can throw immediately, before enqueuing anything.
* At the end (on close), "decompress flush and enqueue" says "If the end of the compressed input has not been reached, then throw a TypeError." If earlier chunks already produced output and only the final validation fails, an implementation may have already enqueued some decompressed bytes before the final error is thrown.

The spec does not mandate detecting the trailing-data error as early as possible versus only on flush, nor does it require suppressing already-enqueued output if a later error occurs. It only mandates that trailing data is an error. So Chrome’s "emit some output, then throw on close" and Safari/Firefox’s "throw without emitting" are both consistent with the current algorithms.

--------

The actual problem appears when you start passing chunked input data to the decompression stream. Let's say you feed the stream two chunks: the first one is valid and the second one is partially valid and ends with arbitrary padding (see the sketch after this comment). In this case:

* The same JS code will produce different results in Chrome and Safari/Firefox
* JS-based alternatives such as `pako` or `fflate` handle such cases without any problems
* There's no way to tell whether a compressed stream is valid or padded before passing the whole payload through the `DecompressionStream`, and so far the only reliable option is to try `DecompressionStream` and gracefully fall back to JS-based libraries
* The actual wording of the `TypeError` is vendor-specific and not defined by the spec (and therefore might change in the future), so checking whether the compressed data is invalid/corrupted or merely padded with arbitrary data is tricky

The reason I'm filing this ticket is that I believe Chrome's approach is more robust and intuitive, and I would love to see the same behaviour in Safari.

P.S. Attached a reproduction script
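For illustration, the sketch below reconstructs the setup described above; it is not the attached reproduction script. It gzip-compresses a small payload with `CompressionStream`, appends a few arbitrary trailing bytes, splits the result into two chunks so the padding ends up in the second chunk, and then pipes both chunks through `DecompressionStream`. The payload text, padding bytes, and split point are arbitrary choices made for the example.

```js
// Minimal sketch of the reported setup (not the attached reproduction script).
// Payload, padding bytes, and split point are arbitrary illustrative choices.
async function repro() {
  const payload = new TextEncoder().encode("hello hello hello world");

  // Produce a valid gzip stream with CompressionStream.
  const gzipped = new Uint8Array(
    await new Response(
      new Blob([payload]).stream().pipeThrough(new CompressionStream("gzip"))
    ).arrayBuffer()
  );

  // Append arbitrary padding after the end of the compressed data.
  const padding = new Uint8Array([0xde, 0xad, 0xbe, 0xef]);
  const padded = new Uint8Array(gzipped.length + padding.length);
  padded.set(gzipped);
  padded.set(padding, gzipped.length);

  // Split into two chunks: the first is entirely valid compressed data,
  // the second holds the tail of the stream plus the trailing padding.
  const splitAt = Math.floor(gzipped.length / 2);
  const chunks = [padded.slice(0, splitAt), padded.slice(splitAt)];

  const ds = new DecompressionStream("gzip");
  const writer = ds.writable.getWriter();
  const reader = ds.readable.getReader();

  // Feed the chunks; writes may reject once the stream errors on the padding.
  const writing = (async () => {
    for (const chunk of chunks) await writer.write(chunk);
    await writer.close();
  })().catch((e) => console.log("write side error:", e.message));

  // Read the output: Chrome may deliver decompressed bytes before throwing,
  // while Safari/Firefox may throw without having enqueued anything.
  let received = 0;
  try {
    for (;;) {
      const { value, done } = await reader.read();
      if (done) break;
      received += value.length;
    }
  } catch (e) {
    console.log(`read side error after ${received} bytes:`, e.message);
  }
  await writing;
}

repro();
```

Per the description above, Chrome logs a non-zero byte count before the error, while Safari and Firefox report the error with zero bytes received.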
Attachments
reproduction script (console-friendly) (1.99 KB, text/javascript)
2025-09-25 12:23 PDT, kdarutkin
no flags
kdarutkin
Comment 1 2025-09-25 14:48:08 PDT
Sam Sneddon [:gsnedders]
Comment 2 2025-09-26 14:41:56 PDT
*** Bug 299542 has been marked as a duplicate of this bug. ***
kdarutkin
Comment 3 2025-10-01 13:21:35 PDT
JFYI the WHATWG spec was updated to cover this issue (https://github.com/whatwg/compression/pull/77) and WPT tests are pending (https://github.com/web-platform-tests/wpt/pull/55118).
Radar WebKit Bug Importer
Comment 4 2025-10-02 12:25:30 PDT