Bug 175568

Summary: data: URL base64 handling different from atob()
Product: WebKit Reporter: Anne van Kesteren <annevk>
Component: DOMAssignee: Anne van Kesteren <annevk>
Status: RESOLVED FIXED    
Severity: Normal CC: achristensen, ap, beidson, cdumez, dbates, d, koivisto, mathias, rbuis, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugs.webkit.org/show_bug.cgi?id=148128
https://bugs.webkit.org/show_bug.cgi?id=261524
Bug Depends on:    
Bug Blocks: 255442    

Description Anne van Kesteren 2017-08-15 03:50:37 PDT
Tests: https://github.com/w3c/web-platform-tests/pull/6890

Eventual standard: https://github.com/whatwg/fetch/pull/579

It seems better if data: URLs and atob() are aligned.
Comment 1 Alexey Proskuryakov 2017-08-15 13:33:19 PDT
Maybe, but don't they explicitly implement different specs (base64 vs. base64url)?

See also https://github.com/whatwg/html/issues/351
Comment 2 Anne van Kesteren 2017-08-16 00:05:04 PDT
They don't, but it does seem like that's how Safari implements data: URLs, contrary to other browsers.

That's why data:;base64,___ decodes in Safari and not in Chrome and Firefox.

I'm surprised this hasn't caused you more compatibility issues. I'll make sure to add some additional tests for this.
Comment 3 Chris Dumez 2017-08-16 15:17:23 PDT
Looking at DataURLDecoder, I see the following logic:
    // First try base64url.
    if (!base64URLDecode(task.encodedData.toStringWithoutCopying(), buffer)) {
        // Didn't work, try unescaping and decoding as base64.
        auto unescapedString = decodeURLEscapeSequences(task.encodedData.toStringWithoutCopying());
        if (!base64Decode(unescapedString, buffer, Base64IgnoreSpacesAndNewLines))
            return;
    }
Comment 4 Chris Dumez 2017-08-16 15:23:56 PDT
(In reply to Chris Dumez from comment #3)
> Looking at DataURLDecoder, I see the following logic:
>     // First try base64url.
>     if (!base64URLDecode(task.encodedData.toStringWithoutCopying(), buffer))
> {
>         // Didn't work, try unescaping and decoding as base64.
>         auto unescapedString =
> decodeURLEscapeSequences(task.encodedData.toStringWithoutCopying());
>         if (!base64Decode(unescapedString, buffer,
> Base64IgnoreSpacesAndNewLines))
>             return;
>     }

This was introduced in Bug 148128 when WebKit started doing data URL decoding by itself. Before this, I believe we let CFNetwork decode those for us (No idea exactly how this behaved).
Comment 5 Antti Koivisto 2017-08-16 21:33:43 PDT
It is there to match the earlier CFNetwork behavior. I believe some tests fail if you remove it.
Comment 6 Anne van Kesteren 2023-04-13 00:24:54 PDT
Rob fixed the tests in https://github.com/WebKit/WebKit/commit/7b467a9d964d4214cfa29005135d429279af7d00 it seems, but I don't get the impression the code is shared.
Comment 7 Anne van Kesteren 2023-04-13 00:34:00 PDT
Okay, this in WebLoaderStrategy.cpp seems bad:

    auto mode = DataURLDecoder::Mode::Legacy;
    if (request.requester() == ResourceRequestRequester::Fetch)
        mode = DataURLDecoder::Mode::ForgivingBase64;

Removing DataURLDecoder::Mode::Legacy completely and seeing what breaks seems like a logical next step here.
Comment 8 Radar WebKit Bug Importer 2023-04-13 01:01:07 PDT
<rdar://problem/107982669>
Comment 9 Anne van Kesteren 2023-04-13 01:02:57 PDT
Pull request: https://github.com/WebKit/WebKit/pull/12693
Comment 10 EWS 2023-04-14 10:54:18 PDT
Committed 262976@main (2518c514c679): <https://commits.webkit.org/262976@main>

Reviewed commits have been landed. Closing PR #12693 and removing active labels.