Bug 225920 - Update Base64 encoding/decoding to match more modern WebKit conventions
Summary: Update Base64 encoding/decoding to match more modern WebKit conventions
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Web Template Framework (show other bugs)
Version: Other
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Sam Weinig
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2021-05-18 09:00 PDT by Sam Weinig
Modified: 2021-05-21 16:16 PDT (History)
24 users (show)

See Also:


Attachments
Patch (162.20 KB, patch)
2021-05-18 16:21 PDT, Sam Weinig
ews-feeder: commit-queue-
Details | Formatted Diff | Diff
Patch (167.14 KB, patch)
2021-05-18 16:37 PDT, Sam Weinig
ews-feeder: commit-queue-
Details | Formatted Diff | Diff
Patch (183.52 KB, patch)
2021-05-18 17:21 PDT, Sam Weinig
no flags Details | Formatted Diff | Diff
Patch (185.86 KB, patch)
2021-05-18 17:41 PDT, Sam Weinig
no flags Details | Formatted Diff | Diff
Patch (185.89 KB, patch)
2021-05-18 17:42 PDT, Sam Weinig
no flags Details | Formatted Diff | Diff
Patch (185.76 KB, patch)
2021-05-18 18:33 PDT, Sam Weinig
ews-feeder: commit-queue-
Details | Formatted Diff | Diff
Patch (185.79 KB, patch)
2021-05-18 18:57 PDT, Sam Weinig
no flags Details | Formatted Diff | Diff
Patch (186.88 KB, patch)
2021-05-18 19:54 PDT, Sam Weinig
no flags Details | Formatted Diff | Diff
Patch (198.43 KB, patch)
2021-05-18 21:33 PDT, Sam Weinig
no flags Details | Formatted Diff | Diff
Patch (198.43 KB, patch)
2021-05-19 07:13 PDT, Sam Weinig
no flags Details | Formatted Diff | Diff
Patch (201.21 KB, patch)
2021-05-19 10:14 PDT, Sam Weinig
no flags Details | Formatted Diff | Diff
Patch (201.21 KB, patch)
2021-05-19 10:14 PDT, Sam Weinig
ews-feeder: commit-queue-
Details | Formatted Diff | Diff
Patch (201.02 KB, patch)
2021-05-19 10:24 PDT, Sam Weinig
ews-feeder: commit-queue-
Details | Formatted Diff | Diff
Patch (201.05 KB, patch)
2021-05-19 10:41 PDT, Sam Weinig
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Sam Weinig 2021-05-18 09:00:20 PDT
Update Base64 encoding/decoding to match more modern WebKit conventions (fewer out values, string concatenation adaptor, etc.)
Comment 1 Sam Weinig 2021-05-18 16:21:02 PDT Comment hidden (obsolete)
Comment 2 Sam Weinig 2021-05-18 16:37:50 PDT Comment hidden (obsolete)
Comment 3 Sam Weinig 2021-05-18 17:21:40 PDT Comment hidden (obsolete)
Comment 4 Sam Weinig 2021-05-18 17:41:10 PDT Comment hidden (obsolete)
Comment 5 Sam Weinig 2021-05-18 17:42:43 PDT Comment hidden (obsolete)
Comment 6 Sam Weinig 2021-05-18 18:33:15 PDT Comment hidden (obsolete)
Comment 7 Sam Weinig 2021-05-18 18:57:01 PDT Comment hidden (obsolete)
Comment 8 Sam Weinig 2021-05-18 19:54:27 PDT Comment hidden (obsolete)
Comment 9 Sam Weinig 2021-05-18 21:33:16 PDT Comment hidden (obsolete)
Comment 10 Sam Weinig 2021-05-18 21:35:37 PDT
There is a lot more cleanup that can be done, but this is a good start.

The main takeaway from this is that we need to make our binary data representations consistent. Our inconsistent use of both const char* and const uint8_t* is a nuisance.

We should probably converge on const uint8_t* and just rid the remaining const char* away (although maybe we want to try to use std::byte instead?)
Comment 11 Sam Weinig 2021-05-19 07:13:09 PDT
Created attachment 429046 [details]
Patch
Comment 12 Chris Dumez 2021-05-19 07:59:42 PDT
Comment on attachment 429046 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=429046&action=review

I did not r+ yet because I dislike all the new makeString(base64encoded()). It makes the code more verbose than it used to be, which is unfortunate.

> Source/WTF/ChangeLog:8
> +        Bring base64 encoding/decoding up to more modern WebKit convetions by:

typo: convetions

> Source/WTF/wtf/text/Base64.cpp:34
> +static constexpr unsigned encodeMapSize = 64;

I believe we don't need the static.

> Source/WTF/wtf/text/Base64.cpp:35
> +static constexpr unsigned decodeMapSize = 128;

ditto.

> Source/WTF/wtf/text/Base64.cpp:118
> +            destinationDataBuffer[didx++] = encodeMap[ (inputDataBuffer[sidx    ] >> 2) & 077];

I thought we had stopped adding spacing for alignment like that in new code? I am personally not a fan as it's a pain to maintain and they eventually get misaligned.

> Source/WTF/wtf/text/Base64.cpp:141
> +        destinationDataBuffer[didx] = '=';

destinationDataBuffer[didx++] = '=';

> Source/WTF/wtf/text/Base64.cpp:142
> +        ++didx;

Then we don't need this line.

Or we could use a for loop:
for (; didx < destinationLength; ++didx)
  destinationDataBuffer[didx] = '=';

> Source/WebCore/Modules/plugins/QuickTimePluginReplacement.mm:249
> +        data64 = makeString("data:", mimeType, ";base64,", base64Encoded([data bytes], [data length]));

Is this why base64Encoded() doesn't return a String. Is is more efficient in this case? I am assuming there is a reason base64Encoded() doesn't simply return a String (which would be much nicer to use) considering most call sites need a String.
Comment 13 Sam Weinig 2021-05-19 09:25:00 PDT
(In reply to Chris Dumez from comment #12)
> Comment on attachment 429046 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=429046&action=review
> 
> I did not r+ yet because I dislike all the new makeString(base64encoded()).
> It makes the code more verbose than it used to be, which is unfortunate.

Fair enough. I thought that might be the case for some. The other option here is:

Rename base64Encode() (which returns a vector) to base64EncodeToVector() or base64EncodeToData(), and add base64EncodeToString() (or base64Encode() could just return a String()?)


> 
> > Source/WTF/ChangeLog:8
> > +        Bring base64 encoding/decoding up to more modern WebKit convetions by:
> 
> typo: convetions
> 
> > Source/WTF/wtf/text/Base64.cpp:34
> > +static constexpr unsigned encodeMapSize = 64;
> 
> I believe we don't need the static.

You are right. Will fix.

> > Source/WTF/wtf/text/Base64.cpp:118
> > +            destinationDataBuffer[didx++] = encodeMap[ (inputDataBuffer[sidx    ] >> 2) & 077];
> 
> I thought we had stopped adding spacing for alignment like that in new code?
> I am personally not a fan as it's a pain to maintain and they eventually get
> misaligned.

It made reading and understanding this code so much easier for me. I'm not really sure I understand the argument for hard to maintain.

> 
> > Source/WTF/wtf/text/Base64.cpp:141
> > +        destinationDataBuffer[didx] = '=';
> 
> destinationDataBuffer[didx++] = '=';
> 
> > Source/WTF/wtf/text/Base64.cpp:142
> > +        ++didx;
> 
> Then we don't need this line.

Ok.

> 
> Or we could use a for loop:
> for (; didx < destinationLength; ++didx)
>   destinationDataBuffer[didx] = '=';
> 
> > Source/WebCore/Modules/plugins/QuickTimePluginReplacement.mm:249
> > +        data64 = makeString("data:", mimeType, ";base64,", base64Encoded([data bytes], [data length]));
> 
> Is this why base64Encoded() doesn't return a String. Is is more efficient in
> this case? I am assuming there is a reason base64Encoded() doesn't simply
> return a String (which would be much nicer to use) considering most call
> sites need a String.

Yes, it avoids an extra allocation and copy
Comment 14 Sam Weinig 2021-05-19 10:14:12 PDT
Created attachment 429060 [details]
Patch
Comment 15 Sam Weinig 2021-05-19 10:14:53 PDT
Created attachment 429061 [details]
Patch
Comment 16 Sam Weinig 2021-05-19 10:24:24 PDT
Created attachment 429064 [details]
Patch
Comment 17 Chris Dumez 2021-05-19 10:32:19 PDT
Comment on attachment 429064 [details]
Patch

r=me
Comment 18 Sam Weinig 2021-05-19 10:41:02 PDT
Created attachment 429068 [details]
Patch
Comment 19 EWS 2021-05-19 11:43:38 PDT
Committed r277740 (237912@main): <https://commits.webkit.org/237912@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 429068 [details].
Comment 20 Radar WebKit Bug Importer 2021-05-19 11:44:23 PDT
<rdar://problem/78216131>
Comment 21 Darin Adler 2021-05-19 12:31:39 PDT
Comment on attachment 429068 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=429068&action=review

> Source/WTF/wtf/text/Base64.h:77
> +WTF_EXPORT_PRIVATE Optional<Vector<uint8_t>> base64Decode(const String&, OptionSet<Base64DecodeOptions> = { }, Base64DecodeMap = Base64DecodeMap::Default);
> +WTF_EXPORT_PRIVATE Optional<Vector<uint8_t>> base64Decode(StringView, OptionSet<Base64DecodeOptions> = { }, Base64DecodeMap = Base64DecodeMap::Default);
> +WTF_EXPORT_PRIVATE Optional<Vector<uint8_t>> base64Decode(const Vector<uint8_t>&, OptionSet<Base64DecodeOptions> = { }, Base64DecodeMap = Base64DecodeMap::Default);
> +WTF_EXPORT_PRIVATE Optional<Vector<uint8_t>> base64Decode(const Vector<char>&, OptionSet<Base64DecodeOptions> = { }, Base64DecodeMap = Base64DecodeMap::Default);
> +WTF_EXPORT_PRIVATE Optional<Vector<uint8_t>> base64Decode(const uint8_t*, unsigned, OptionSet<Base64DecodeOptions> = { }, Base64DecodeMap = Base64DecodeMap::Default);
> +WTF_EXPORT_PRIVATE Optional<Vector<uint8_t>> base64Decode(const char*, unsigned, OptionSet<Base64DecodeOptions> = { }, Base64DecodeMap = Base64DecodeMap::Default);

Now I suggest we remove the overloads for const String& and const char* since both of those convert into StringView efficiently. Unless there is some kind of problem with ambiguity at call sites?
Comment 22 Sam Weinig 2021-05-19 14:04:00 PDT
(In reply to Darin Adler from comment #21)
> Comment on attachment 429068 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=429068&action=review
> 
> > Source/WTF/wtf/text/Base64.h:77
> > +WTF_EXPORT_PRIVATE Optional<Vector<uint8_t>> base64Decode(const String&, OptionSet<Base64DecodeOptions> = { }, Base64DecodeMap = Base64DecodeMap::Default);
> > +WTF_EXPORT_PRIVATE Optional<Vector<uint8_t>> base64Decode(StringView, OptionSet<Base64DecodeOptions> = { }, Base64DecodeMap = Base64DecodeMap::Default);
> > +WTF_EXPORT_PRIVATE Optional<Vector<uint8_t>> base64Decode(const Vector<uint8_t>&, OptionSet<Base64DecodeOptions> = { }, Base64DecodeMap = Base64DecodeMap::Default);
> > +WTF_EXPORT_PRIVATE Optional<Vector<uint8_t>> base64Decode(const Vector<char>&, OptionSet<Base64DecodeOptions> = { }, Base64DecodeMap = Base64DecodeMap::Default);
> > +WTF_EXPORT_PRIVATE Optional<Vector<uint8_t>> base64Decode(const uint8_t*, unsigned, OptionSet<Base64DecodeOptions> = { }, Base64DecodeMap = Base64DecodeMap::Default);
> > +WTF_EXPORT_PRIVATE Optional<Vector<uint8_t>> base64Decode(const char*, unsigned, OptionSet<Base64DecodeOptions> = { }, Base64DecodeMap = Base64DecodeMap::Default);
> 
> Now I suggest we remove the overloads for const String& and const char*
> since both of those convert into StringView efficiently. Unless there is
> some kind of problem with ambiguity at call sites?

I doubt there will be an issue, just didn't occur to me. Will do.