WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
174816
[GTK][WPE] Need a function to convert internal URI to display ("pretty") URI
https://bugs.webkit.org/show_bug.cgi?id=174816
Summary
[GTK][WPE] Need a function to convert internal URI to display ("pretty") URI
Michael Catanzaro
Reported
2017-07-25 05:37:59 PDT
WebKitWebView should have a "pretty" URI property. I don't know what we should name it; perhaps presentable-uri or decoded-uri or something along those lines. This URI would be for use in user-visible locations. It should percent-decode the URI, so it will contain UTF-8 characters instead of ASCII. It should also perform IDN decoding. Currently, Epiphany performs both tasks itself, which is not good because most applications really ought to be doing the same thing, using the same code, and it is a significant amount of code. See ephy_uri_decode() in ephy-uri-helpers.c: it's about 40 lines of code to convert WebKitWebView:uri to be suitable for displaying to the user. That's not good! If we add this property, we would need to include a defense against IDN homograph attacks. Epiphany does not currently have such a defense. So some research on IDN homograph defenses will be needed. I like the Firefox approach better than Chrome's approach. (Chrome completely bans characters that look like Latin characters, essentially banning Cyrillic URIs entirely. I'd rather have lower security than harm Cyrillic users like that.) Finally, in Epiphany, we should see if we can remove ephy_uri_decode() (and maybe also ephy_uri_normalize()?) and use this new property instead. I won't be working on this right away. Just reporting it for now!
Attachments
Patch
(7.12 KB, patch)
2018-01-07 10:22 PST
,
Gabriel Ivașcu
no flags
Details
Formatted Diff
Diff
Patch
(21.72 KB, patch)
2018-01-08 04:59 PST
,
Gabriel Ivașcu
no flags
Details
Formatted Diff
Diff
Patch
(59.15 KB, patch)
2018-01-15 08:58 PST
,
Gabriel Ivașcu
no flags
Details
Formatted Diff
Diff
Patch
(59.70 KB, patch)
2018-01-15 09:09 PST
,
Gabriel Ivașcu
no flags
Details
Formatted Diff
Diff
Archive of layout-test-results from ews101 for mac-sierra
(2.28 MB, application/zip)
2018-01-15 10:11 PST
,
EWS Watchlist
no flags
Details
Archive of layout-test-results from ews106 for mac-sierra-wk2
(2.76 MB, application/zip)
2018-01-15 10:17 PST
,
EWS Watchlist
no flags
Details
Archive of layout-test-results from ews114 for mac-sierra
(2.94 MB, application/zip)
2018-01-15 10:42 PST
,
EWS Watchlist
no flags
Details
Patch
(61.60 KB, patch)
2018-01-20 03:47 PST
,
Gabriel Ivașcu
no flags
Details
Formatted Diff
Diff
Archive of layout-test-results from ews102 for mac-sierra
(2.21 MB, application/zip)
2018-01-20 04:47 PST
,
EWS Watchlist
no flags
Details
Archive of layout-test-results from ews106 for mac-sierra-wk2
(2.59 MB, application/zip)
2018-01-20 04:54 PST
,
EWS Watchlist
no flags
Details
Archive of layout-test-results from ews114 for mac-sierra
(2.97 MB, application/zip)
2018-01-20 05:13 PST
,
EWS Watchlist
no flags
Details
Patch
(60.75 KB, patch)
2018-01-27 10:50 PST
,
Gabriel Ivașcu
no flags
Details
Formatted Diff
Diff
Archive of layout-test-results from ews103 for mac-sierra
(2.22 MB, application/zip)
2018-01-27 11:51 PST
,
EWS Watchlist
no flags
Details
Archive of layout-test-results from ews104 for mac-sierra-wk2
(2.55 MB, application/zip)
2018-01-27 11:56 PST
,
EWS Watchlist
no flags
Details
Archive of layout-test-results from ews114 for mac-sierra
(2.94 MB, application/zip)
2018-01-27 12:17 PST
,
EWS Watchlist
no flags
Details
Patch
(63.51 KB, patch)
2018-06-12 09:03 PDT
,
Ms2ger (he/him; ⌚ UTC+1/+2)
no flags
Details
Formatted Diff
Diff
Patch
(68.21 KB, patch)
2018-06-19 08:37 PDT
,
Ms2ger (he/him; ⌚ UTC+1/+2)
no flags
Details
Formatted Diff
Diff
Patch
(68.24 KB, patch)
2018-06-19 09:04 PDT
,
Ms2ger (he/him; ⌚ UTC+1/+2)
no flags
Details
Formatted Diff
Diff
Archive of layout-test-results from ews126 for ios-simulator-wk2
(2.24 MB, application/zip)
2018-06-19 10:50 PDT
,
EWS Watchlist
no flags
Details
Patch
(69.09 KB, patch)
2018-06-28 06:07 PDT
,
Ms2ger (he/him; ⌚ UTC+1/+2)
no flags
Details
Formatted Diff
Diff
Patch
(71.22 KB, patch)
2018-07-04 09:15 PDT
,
Ms2ger (he/him; ⌚ UTC+1/+2)
cgarcia
: review-
ews-watchlist
: commit-queue-
Details
Formatted Diff
Diff
Archive of layout-test-results from ews125 for ios-simulator-wk2
(2.24 MB, application/zip)
2018-07-04 11:03 PDT
,
EWS Watchlist
no flags
Details
Archive of layout-test-results from ews201 for win-future
(12.76 MB, application/zip)
2018-07-04 13:23 PDT
,
EWS Watchlist
no flags
Details
Patch
(114.10 KB, patch)
2018-12-07 05:14 PST
,
Ms2ger (he/him; ⌚ UTC+1/+2)
mcatanzaro
: review-
ews-watchlist
: commit-queue-
Details
Formatted Diff
Diff
Archive of layout-test-results from ews122 for ios-simulator-wk2
(9.72 MB, application/zip)
2018-12-07 07:09 PST
,
EWS Watchlist
no flags
Details
Patch
(108.73 KB, patch)
2018-12-11 01:41 PST
,
Ms2ger (he/him; ⌚ UTC+1/+2)
no flags
Details
Formatted Diff
Diff
Patch
(109.44 KB, patch)
2018-12-14 08:32 PST
,
Ms2ger (he/him; ⌚ UTC+1/+2)
no flags
Details
Formatted Diff
Diff
Patch
(109.52 KB, patch)
2018-12-18 01:04 PST
,
Ms2ger (he/him; ⌚ UTC+1/+2)
no flags
Details
Formatted Diff
Diff
Patch
(109.52 KB, patch)
2018-12-18 01:12 PST
,
Ms2ger (he/him; ⌚ UTC+1/+2)
no flags
Details
Formatted Diff
Diff
Patch
(112.49 KB, patch)
2019-01-28 09:00 PST
,
Ms2ger (he/him; ⌚ UTC+1/+2)
no flags
Details
Formatted Diff
Diff
Patch
(112.48 KB, patch)
2019-01-29 00:51 PST
,
Ms2ger (he/him; ⌚ UTC+1/+2)
no flags
Details
Formatted Diff
Diff
Patch
(112.36 KB, patch)
2019-01-29 08:55 PST
,
Ms2ger (he/him; ⌚ UTC+1/+2)
no flags
Details
Formatted Diff
Diff
Patch
(112.35 KB, patch)
2019-01-30 00:52 PST
,
Ms2ger (he/him; ⌚ UTC+1/+2)
no flags
Details
Formatted Diff
Diff
Patch
(112.42 KB, patch)
2019-01-30 03:50 PST
,
Ms2ger (he/him; ⌚ UTC+1/+2)
mcatanzaro
: review+
mcatanzaro
: commit-queue-
Details
Formatted Diff
Diff
Show Obsolete
(33)
View All
Add attachment
proposed patch, testcase, etc.
Michael Catanzaro
Comment 1
2017-07-25 05:45:50 PDT
Also, ephy_uri_decode() currently contains a mutex, because there's no convenient location to store the ICU UIDNA context so it needs to be created inside the helper function. That's not good as it entails an unnecessary locking cost. It's also overkill, because the function is probably only ever used from the UI thread, but we don't want helper functions modifying global state like that unless they are threadsafe as that'd be quite a footgun. So we can't do better in Epiphany. We can do better for WebKit. The UIDNA context should become a private member of WebKitWebView, so it would be created once on startup and no mutex would be needed at all.
Adrian Perez
Comment 2
2017-07-25 12:49:36 PDT
(In reply to Michael Catanzaro from
comment #1
)
> Also, ephy_uri_decode() currently contains a mutex [...] > > We can do better for WebKit. The UIDNA context should become a private > member of WebKitWebView, so it would be created once on startup and no mutex > would be needed at all.
Couldn't we do _even_ better? It looks to me that the UIDNA context could be a global singleton, AFAIU it does not need to be associated with a WebKitWebView.
Michael Catanzaro
Comment 3
2017-07-25 13:48:58 PDT
If it consumes a substantial amount of memory (I suppose it might), then we could do that. It would be an error to access this property from any thread other than the UI thread, so there is no harm in that. I generally prefer to avoid global singletons since they've caused us so much trouble in the past, but if a UIDNA context requires a substantial amount of memory then we'd better not create a new one for each web view.
Ms2ger (he/him; ⌚ UTC+1/+2)
Comment 4
2017-10-23 06:53:53 PDT
Looking into this, I'm not sure if solving this just for WebKitWebView's uri makes sense. ephy_uri_decode is also used with urls returned from webkit_download_get_destination and webkit_hit_test_result_get_link_uri, for example. Perhaps exposing something like ephy_uri_decode directly would be better?
Michael Catanzaro
Comment 5
2017-10-23 07:10:09 PDT
Hm. First of all: good point. One problem with that is that we do not currently have any "free functions" -- functions that are not methods of objects -- exposed in our API. It would be a significant change for us to expose a function with no associated object. And yet, you're right, that seems to be required here. What we really want is for the functionality of ephy_uri_decode() to move into WebKit (with some added defenses against IDN homograph attacks). I would ask Carlos Garcia if he has an API recommendation.
Carlos Garcia Campos
Comment 6
2017-10-23 23:35:54 PDT
I don't see any problem in adding global functions not attached to any particular object. We had some of those in WebKit1 API, under a globals section in the docs (not sure it's a good idea to keep using globals, but still).
https://webkitgtk.org/reference/webkitgtk/stable/webkitgtk-Global-functions.html
Adrian Perez
Comment 7
2017-10-24 00:17:40 PDT
(In reply to Carlos Garcia Campos from
comment #6
)
> I don't see any problem in adding global functions not attached to any > particular object. We had some of those in WebKit1 API, under a globals > section in the docs (not sure it's a good idea to keep using globals, but > still). > >
https://webkitgtk.org/reference/webkitgtk/stable/webkitgtk-Global-functions
. > html
I don't see any problem either in having functions which are “pure” (only read their arguments, produce a result which only depends on the input) and don't need state (= an instance). From a functional POV one could argue that there are the best kind of functions ;-)
Gabriel Ivașcu
Comment 8
2017-11-08 04:25:38 PST
(In reply to Michael Catanzaro from
comment #0
)
> If we add this property, we would need to include a defense against IDN > homograph attacks. Epiphany does not currently have such a defense. So some > research on IDN homograph defenses will be needed. I like the Firefox > approach better than Chrome's approach. (Chrome completely bans characters > that look like Latin characters, essentially banning Cyrillic URIs entirely. > I'd rather have lower security than harm Cyrillic users like that.)
I did a bit research and if you wish to follow Mozilla's way, I think there are two approaches for this problem: (1) Allow strings containing characters from the same language script only, e.g. all chars are Greek, all chars are Cyrillic, etc. This is easier to implement, but is more restrictive since it "blocks" non-Latin domains with characters from multiple scripts. (2) Allow strings containing characters from the same language script and characters from some allowed combinations of scripts. This is what Mozilla does actually (
https://wiki.mozilla.org/IDN_Display_Algorithm#Algorithm
). It's more permissive than (1), but a bit harder to implement. I'm willing to work on either (1) or (2). We can implement it in Epiphany first, and then move it to WebKit along with the code in ephy_uri_decode().
Michael Catanzaro
Comment 9
2017-11-08 07:39:42 PST
I think (2) would be ideal, if there's a code comment linking to the Mozilla page. But (1) would be OK if (2) is difficult. Agreed that implementing it in Epiphany first would be best.
Gabriel Ivașcu
Comment 10
2018-01-06 03:36:34 PST
Sorry for the late reply. Do we have a resolution on whether this should be a "global" function rather than a method of WebKitWebView?
Michael Catanzaro
Comment 11
2018-01-06 09:08:46 PST
I failed to get a recommendation from Carlos Garcia as to where he wants it, so I would just add a new file (WebKitURIUtilities.h) and a new section in the documentation "URI Utilities" for them. We can always reorganize those later; the exact layout of the headers is not API.
Gabriel Ivașcu
Comment 12
2018-01-07 10:22:25 PST
Created
attachment 330658
[details]
Patch
Gabriel Ivașcu
Comment 13
2018-01-07 10:52:37 PST
(In reply to Gabriel Ivașcu from
comment #12
)
> Created
attachment 330658
[details]
> Patch
OK, seems that the new files were not added to the patch. Is there any way to tell the webkit-patch script to add the new files too to the patch?
Michael Catanzaro
Comment 14
2018-01-07 14:28:24 PST
(In reply to Gabriel Ivașcu from
comment #13
)
> OK, seems that the new files were not added to the patch. Is there any way > to tell the webkit-patch script to add the new files too to the patch?
Use 'git add' (only for the new files) Yes, the workflow is not very good....
Carlos Alberto Lopez Perez
Comment 15
2018-01-08 04:01:54 PST
(In reply to Gabriel Ivașcu from
comment #13
)
> (In reply to Gabriel Ivașcu from
comment #12
) > > Created
attachment 330658
[details]
> > Patch > > OK, seems that the new files were not added to the patch. Is there any way > to tell the webkit-patch script to add the new files too to the patch?
I usually create a new branch, commit the (going to be uploaded) patch locally and then pass to webkit-patch the sha1 hash of the local commit_id I want to upload with the flag "-g"
Gabriel Ivașcu
Comment 16
2018-01-08 04:59:28 PST
Created
attachment 330687
[details]
Patch
Carlos Garcia Campos
Comment 17
2018-01-08 05:25:20 PST
Comment on
attachment 330687
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=330687&action=review
Isn't this what WebCore::userVisibleString() does? Can't we implement that in WebCore instead? or even make the current objc implementation cross-platform?
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:47 > +static const UIDNA* internationalDomainNameTranscoder() > +{ > + static UIDNA* transcoder; > + static std::once_flag onceFlag; > + std::call_once(onceFlag, [] { > + UErrorCode error = U_ZERO_ERROR; > + // These flags should be synced with URLParser::internationalDomainNameTranscoder() in URLParser.cpp. > + transcoder = uidna_openUTS46(UIDNA_CHECK_BIDI | UIDNA_CHECK_CONTEXTJ | UIDNA_NONTRANSITIONAL_TO_UNICODE | UIDNA_NONTRANSITIONAL_TO_ASCII, &error); > + RELEASE_ASSERT(U_SUCCESS(error)); > + RELEASE_ASSERT(transcoder); > + }); > + return transcoder; > +}
Why are we duplicating this? Can't we just use URLParser::internationalDomainNameTranscoder()?
Michael Catanzaro
Comment 18
2018-01-08 05:48:04 PST
(In reply to Carlos Garcia Campos from
comment #17
)
> Isn't this what WebCore::userVisibleString() does? Can't we implement that > in WebCore instead? or even make the current objc implementation > cross-platform?
I didn't know about this. WebCore::userVisibleString only exists in Cocoa ports. It's declared in the Objective C++ header WebCoreNSURLExtras.h. It converts an NSURL* to an NSString*. We can certainly create a new function with the same name, converting a const char* to a char* instead, and move it to the platform layer. Then it could be used in the pasteboard code, same as on Mac, if we ever decide that it'd be desirable to copy the prettified URL to the clipboard. But I'm not sure that's actually what we want to do with the pasteboard, and we can always move it down from the API layer to the platform layer in the future if we do decide to do that. So I think it's probably better where it is now.
> > Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:47 > > +static const UIDNA* internationalDomainNameTranscoder() > > +{ > > + static UIDNA* transcoder; > > + static std::once_flag onceFlag; > > + std::call_once(onceFlag, [] { > > + UErrorCode error = U_ZERO_ERROR; > > + // These flags should be synced with URLParser::internationalDomainNameTranscoder() in URLParser.cpp. > > + transcoder = uidna_openUTS46(UIDNA_CHECK_BIDI | UIDNA_CHECK_CONTEXTJ | UIDNA_NONTRANSITIONAL_TO_UNICODE | UIDNA_NONTRANSITIONAL_TO_ASCII, &error); > > + RELEASE_ASSERT(U_SUCCESS(error)); > > + RELEASE_ASSERT(transcoder); > > + }); > > + return transcoder; > > +} > > Why are we duplicating this? Can't we just use > URLParser::internationalDomainNameTranscoder()?
Yes, that's a good idea.
Carlos Garcia Campos
Comment 19
2018-01-08 06:00:32 PST
(In reply to Michael Catanzaro from
comment #18
)
> (In reply to Carlos Garcia Campos from
comment #17
) > > Isn't this what WebCore::userVisibleString() does? Can't we implement that > > in WebCore instead? or even make the current objc implementation > > cross-platform? > > I didn't know about this. WebCore::userVisibleString only exists in Cocoa > ports. It's declared in the Objective C++ header WebCoreNSURLExtras.h. It > converts an NSURL* to an NSString*. We can certainly create a new function > with the same name, converting a const char* to a char* instead, and move it > to the platform layer. Then it could be used in the pasteboard code, same as > on Mac, if we ever decide that it'd be desirable to copy the prettified URL > to the clipboard. But I'm not sure that's actually what we want to do with > the pasteboard, and we can always move it down from the API layer to the > platform layer in the future if we do decide to do that. So I think it's > probably better where it is now.
I disagree, there's no point on duplicating the code. The fact that it's cocoa only doesn't mean that the code is 100% cocoa specific. It converts NSURL to NSString probably because it's more convenient, being cocoa specific code, but maybe we can share 99% of the implementation. I would check it first.
Michael Catanzaro
Comment 20
2018-01-08 07:04:10 PST
Comment on
attachment 330687
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=330687&action=review
Good job as usual, I see you even converted from GLib types like GHashTable to WTF types without me having to suggest that. And you updated the WPE headers as well. Great. Like Carlos suggests, this code is indeed lower-level than what we normally have up here in the GTK API layer. So it would indeed make sense to move the implementation down to a lower layer, and call that from the API layer. But instead of implementing a new version of WebCore::userVisibleString, I'm going to suggest moving the code directly into the URL class instead. How about URL::userVisibleURL? You'd declare it like this in the header file Source/WebCore/platform/URL.h: #if USE(SOUP) URL(SoupURI*); GUniquePtr<SoupURI> createSoupURI() const; String userVisibleURL() const; #endif The implementation would go in Source/WebCore/platform/soup/URLSoup.cpp. You can move the webkit_uri_prettify() code there. To convert it to a SoupURI you can just use URL::createSoupURI. Then webkit_uri_prettify() just has to create a new WebCore::URL and, call the userVisibleURL method, and convert the result of that from String to char*. Should look like: g_strdup(url.userVisibleURL().utf8.data()).
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:60 > + WTF::HashMap<guint, guint> scriptMap;
Style: use "unsigned" instead of guint in implementation code (except when working with APIs that use guint).
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:125 > + * webkit_uri_prettify:
Hm, I don't like the name. Perhaps: webkit_uri_decode_for_display()? Or webkit_decode_uri_for_display()?
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:130 > + * WebKit may contain percent-encoded characters or Punycode, which may not be > + * always suitable to display to users. This function provides protection
"which are not generally suitable"
> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitURIUtilities.cpp:31 > + // Latin-only, OK. > + { "
http://abcdef.com/
", "
http://abcdef.com/
" },
Is this list copied from somewhere? Add a link so that it will be possible to understand and update it in the future. At least add a link to the Mozilla design doc, so there's some indication of what you're testing here.
> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitURIUtilities.cpp:69 > + char* prettyURI = webkit_uri_prettify(items[i].input);
You can and should use smart pointers in the API tests: GUniquePtr<char> prettyURI(webkit_uri_prettify(items[i].input)); Then you don't have to free it manually.
Michael Catanzaro
Comment 21
2018-01-08 07:46:27 PST
Also: there is a comment in URLParser::internationalDomainNameTranscoder that needs to be deleted now, since we'll be removing this code from Epiphany. (In reply to Carlos Garcia Campos from
comment #19
)
> I disagree, there's no point on duplicating the code. The fact that it's > cocoa only doesn't mean that the code is 100% cocoa specific. It converts > NSURL to NSString probably because it's more convenient, being cocoa > specific code, but maybe we can share 99% of the implementation. I would > check it first.
I was going to say "looks like a lot of code to just reimplement g_uri_unescape_string()", but at the bottom it also does some map hostnames thing... then I scrolled up... and, OK, the mapping is a similar calculation to what Gabriel has done here. So you're right: it would be better to share this code with Mac, rather than to implement our own version with a separate set of bugs that won't get updated in the future. It looks like WebCoreNSURLExtras implements even more checks than ours, or at least different checks. So, Gabriel, sharing the code with Mac will probably consume the rest of your project hours. Do you think you have enough hours left to work on this? It won't be easy, because you probably don't have a Mac to be able to run the Objective C++ code, and who knows how all these NS APIs work... the cost of having solved hard problems in the past is that people then give you more hard problems. :P I guess the goal would be to (a) understand how WebCore::userVisibleString in WebCoreNSURLExtras.mm works, (b) move the code into either WebCore::URL (or maybe WebCore::URLParser?), eliminating the use of Objective C++ and the NS dependencies, and (c) decide how to handle the differences between its implementation and yours. Then (d) move your test to Tools/TestWebKitAPI/Tests/WebCore/URL.cpp (or maybe URLParser.cpp). I see Carlos has already CCed Alex Christensen from Apple on this bug. He wrote URLParser and is an expert on these URL classes, so if you have questions, he might know the answer.
Carlos Garcia Campos
Comment 22
2018-01-09 00:48:53 PST
Note there are already unit tests in Tools/TestWebKitAPI/Tests/WebCore/cocoa/URLExtras.mm
Carlos Garcia Campos
Comment 23
2018-01-09 00:49:56 PST
Comment on
attachment 330687
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=330687&action=review
>> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:125 >> + * webkit_uri_prettify: > > Hm, I don't like the name. > > Perhaps: webkit_uri_decode_for_display()? Or webkit_decode_uri_for_display()?
Or simply webkit_uri_for_display().
Michael Catanzaro
Comment 24
2018-01-09 07:56:23 PST
(In reply to Carlos Garcia Campos from
comment #23
)
> Or simply webkit_uri_for_display().
Yes, that's a good name!
Gabriel Ivașcu
Comment 25
2018-01-09 13:12:26 PST
(In reply to Michael Catanzaro from
comment #21
)
> So, Gabriel, sharing the code with Mac will probably consume the rest of > your project hours. Do you think you have enough hours left to work on this? > It won't be easy, because you probably don't have a Mac to be able to run > the Objective C++ code, and who knows how all these NS APIs work... the cost > of having solved hard problems in the past is that people then give you more > hard problems. :P I guess the goal would be to (a) understand how > WebCore::userVisibleString in WebCoreNSURLExtras.mm works, (b) move the code > into either WebCore::URL (or maybe WebCore::URLParser?), eliminating the use > of Objective C++ and the NS dependencies, and (c) decide how to handle the > differences between its implementation and yours. Then (d) move your test to > Tools/TestWebKitAPI/Tests/WebCore/URL.cpp (or maybe URLParser.cpp).
Thanks for summarizing it like this. This will not be a trivial task for me because I don't have any knowledge of Objective C++ and NS API, but I'm willing to give it a try. That's also why this is going to probably take me a bit more time than usual. I'll try to come up with a patch as soon as possible. I still need to ask: are you and Carlos sure that this is going to work as intended, i.e. there isn't any Mac-specific code in WebCoreNSURLExtras that would prevent us from moving the implementation to the WebCore layer?
Michael Catanzaro
Comment 26
2018-01-09 20:23:45 PST
Not at all sure. Would it still be fun if you knew in advance that the task is possible? :D I think it should be, though, and you have a couple advantages: * This file contains very few Objective C style function calls, the weird [[ ]] things. I've never been able to wrap my head around those. * It also contains fairly few NS and CF API calls, it looks like almost all can be eliminated by converting from NSString to String, and these APIs are well-documented on the web. That will probably be most of the work, and it looks like the main reason this file is not cross-platform. * There are a bunch of comments to help. But working against you is that (a) reading other people's code is hard, (b) understanding other people's code well enough to rewrite it in another language is harder, (c) this code is complex, and (d) it's not a small amount. So this is certainly a hard task. Let's narrow the task a bit, to something I'm more confident is doable. I would try to start with only a small portion, the most important part of the code to share: the part that handles the decision of whether to decode the IDN or not. The code that's directly in WebCore::userVisibleString looks less-important to share; looks like most of that code is unescaping the string in the manner of g_uri_unescape_string(). Even the code in WebCore::mapHostNames and WebCore::applyHostNameFunctionToURLString looks non-essential to share: it looks like it's mostly splitting up the URL string into the host component (which SoupURI does for us) and then into labels (which your patch already does). That also looks like the hardest portion of the code to rewrite, since it makes heavy use of NSString and NS/CF API calls. So forget about sharing that, at least for now. The part that looks most interesting to share begins in WebCore::mapHostNameWithRange. I think we really want to share WebCore::allCharactersInIDNScriptWhiteList, WebCore::allCharactersAllowedByTLDRules, and all the code higher than that in WebCoreNSURLExtras.mm. Forget about the code below it: that's too much work to rewrite to be cross-platform. (If Carlos isn't satisfied by that, maybe he can convert the rest. :D But I think he'll be pleased to just share the IDN code.) So I think you can keep your code to percent-decode the URL, and your code to split it into labels, and just focus on sharing the IDN-related code. Fortunately, that looks like it uses very little Objective C++ and very few NS/CF API calls. Probably the best place for all that code, once it's rewritten as C++, would be WebCore::URLParser. Or maybe WebCore::URL. I'd start with WebCore::URLParser, knowing it will be easy to move later if requested, once it's out of WebCoreNSURLExtras.h. Now, it does look like Apple's strategy for handling IDN homographs is different from Firefox's. It seems to be based on hardcoded knowledge of the different character set policies of different TLDs. I'll leave you to study that and suggest changes if you think it appropriate, keeping in mind that any changes here would need to be approved by Apple. Usually we just match Apple's behavior, since that's easiest, unless we have a strong reason to try to change it, but if you think the Firefox approach you already implemented is better, that would certainly be worth discussing. P.S. Remember to delete the comment in URLParser::internationalDomainNameTranscoder!
Michael Catanzaro
Comment 27
2018-01-09 20:26:03 PST
That should cut your task roughly in half, BTW, so hopefully that will improve your confidence. ;)
Gabriel Ivașcu
Comment 28
2018-01-15 08:58:50 PST
Created
attachment 331338
[details]
Patch
Gabriel Ivașcu
Comment 29
2018-01-15 09:06:19 PST
Please let me know what you think of the above patch. As far as I could reckon, Apple uses a different policy than Mozilla when it comes to interpreting Unicode characters in URLs, virtually banning almost every Unicode character. I removed all own script code computations from my previous patch and went with what was already provided in WebCoreNSURLExtras.mm, therefore reducing the number of test cases from my previous patch.
Gabriel Ivașcu
Comment 30
2018-01-15 09:09:34 PST
Created
attachment 331339
[details]
Patch
EWS Watchlist
Comment 31
2018-01-15 10:11:05 PST
Comment hidden (obsolete)
Comment on
attachment 331339
[details]
Patch
Attachment 331339
[details]
did not pass mac-ews (mac): Output:
http://webkit-queues.webkit.org/results/6082564
New failing tests: fast/url/user-visible/srb.html fast/url/user-visible/rf.html fast/encoding/idn-security.html fast/url/user-visible/cyrillic-NFD.html
EWS Watchlist
Comment 32
2018-01-15 10:11:06 PST
Comment hidden (obsolete)
Created
attachment 331342
[details]
Archive of layout-test-results from ews101 for mac-sierra The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews101 Port: mac-sierra Platform: Mac OS X 10.12.6
EWS Watchlist
Comment 33
2018-01-15 10:17:46 PST
Comment hidden (obsolete)
Comment on
attachment 331339
[details]
Patch
Attachment 331339
[details]
did not pass mac-wk2-ews (mac-wk2): Output:
http://webkit-queues.webkit.org/results/6082578
New failing tests: fast/url/user-visible/cyrillic-NFD.html fast/url/user-visible/rf.html fast/url/user-visible/srb.html
EWS Watchlist
Comment 34
2018-01-15 10:17:48 PST
Comment hidden (obsolete)
Created
attachment 331343
[details]
Archive of layout-test-results from ews106 for mac-sierra-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews106 Port: mac-sierra-wk2 Platform: Mac OS X 10.12.6
EWS Watchlist
Comment 35
2018-01-15 10:42:18 PST
Comment hidden (obsolete)
Comment on
attachment 331339
[details]
Patch
Attachment 331339
[details]
did not pass mac-debug-ews (mac): Output:
http://webkit-queues.webkit.org/results/6082602
New failing tests: fast/url/user-visible/cyrillic-NFD.html fast/url/user-visible/rf.html fast/encoding/idn-security.html fast/url/user-visible/srb.html
EWS Watchlist
Comment 36
2018-01-15 10:42:20 PST
Comment hidden (obsolete)
Created
attachment 331345
[details]
Archive of layout-test-results from ews114 for mac-sierra The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews114 Port: mac-sierra Platform: Mac OS X 10.12.6
Michael Catanzaro
Comment 37
2018-01-15 11:18:56 PST
That's exactly what I was hoping to see. Alex, does it look like a good approach to you? Now the "fun" part will be investigating why those four tests are failing. Hopefully the problem with the tests is in the cross-platform code, and you'll be able to reproduce them with WebKitGTK+. Try 'Tools/Scripts/run-webkit-tests --force fast/url/user-visible/'. But the failures could also be due to your changes in NSWebCoreURLExtras.mm. In that case, you're unlucky. If you don't think you can guess the problem for the test output, the easiest way to do this is then to upload patches with added debugging and then wait for the bots to spit out the results.
Michael Catanzaro
Comment 38
2018-01-15 11:21:28 PST
(In reply to Michael Catanzaro from
comment #37
)
> Try 'Tools/Scripts/run-webkit-tests --force fast/url/user-visible/'
If you haven't already, you'll first need to: $ Tools/Scripts/update-webkitgtk-libs $ Tools/Scripts/build-webkit --gtk That will get you a release build under WebKitBuild/Release, where the test runner expects it to be. You could also try copying your build directory from wherever it is now (it would be WebKitBuild/GNOME if you've followed my instructions for WebKit/Epiphany development), but I'm not sure if that will work. I haven't found a better way to handle this, I just build WebKit again whenever I need to switch between working on tests and working on Epiphany.
Michael Catanzaro
Comment 39
2018-01-15 11:34:18 PST
Comment on
attachment 331339
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=331339&action=review
r- for the failing tests.
> Source/WebCore/platform/URLParser.cpp:3265 > +String URLParser::ICUConvertHostName(const String& hostName, bool encode, const uint32_t (&IDNScriptBlackList)[(USCRIPT_CODE_LIMIT + 31) / 32])
Hm, I didn't know you could declare array arguments like this, except in templates. Does the size specification actually do anything? I bet it's ignored. A better declaration would be: const uint32_t* IDNScriptBlackList which everyone will understand.
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:46 > + * Return value: (transfer full): @uri suitable for display
Returns: ("Return value" is an older syntax.)
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:52 > + auto URI = WebCore::URL(WebCore::ParsedURLStringTag::ParsedURLString, WTF::String(uri));
WTF::String is hoisted into the global namespace with a using declaration in its header file, so you don't have to qualify it with WTF::. You do need to use String::fromUTF8 to create the string, though, since WebKit unfortunately does not use UTF-8 internally, like GNOME code does. Also, URI in all caps is not a great name. How about "coreURI"?
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:55 > + URI.setPass(WTF::emptyString());
Ditto, you don't need WTF:: here. (See the bottom of WTFString.h.)
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:58 > + URI.setHost(WebCore::URLParser::ICUConvertHostName(String(percentDecodedHost.get()), false));
Use String::fromUTF8
Gabriel Ivașcu
Comment 40
2018-01-17 04:55:22 PST
Seems that each of the failing tests makes use of at least one function from WebCoreNSURLExtras.h that internally calls mapHostNameWithRange(), which is the function from WebCoreNSURLExtras.mm that I've modified. It's odd that so far I got the expected output when testing the new URLParser::ICUConvertHostName() function in WebKitGTK+ against the host names of the test URLs from Mac. I'll keep investigating, it must be something that I'm missing.
Gabriel Ivașcu
Comment 41
2018-01-20 03:47:01 PST
Created
attachment 331840
[details]
Patch
EWS Watchlist
Comment 42
2018-01-20 04:47:53 PST
Comment hidden (obsolete)
Comment on
attachment 331840
[details]
Patch
Attachment 331840
[details]
did not pass mac-ews (mac): Output:
http://webkit-queues.webkit.org/results/6147997
New failing tests: fast/url/user-visible/srb.html fast/url/user-visible/rf.html fast/encoding/idn-security.html fast/url/user-visible/cyrillic-NFD.html
EWS Watchlist
Comment 43
2018-01-20 04:47:54 PST
Comment hidden (obsolete)
Created
attachment 331841
[details]
Archive of layout-test-results from ews102 for mac-sierra The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews102 Port: mac-sierra Platform: Mac OS X 10.12.6
EWS Watchlist
Comment 44
2018-01-20 04:54:16 PST
Comment hidden (obsolete)
Comment on
attachment 331840
[details]
Patch
Attachment 331840
[details]
did not pass mac-wk2-ews (mac-wk2): Output:
http://webkit-queues.webkit.org/results/6148008
New failing tests: fast/url/user-visible/srb.html fast/url/user-visible/rf.html fast/url/user-visible/cyrillic-NFD.html
EWS Watchlist
Comment 45
2018-01-20 04:54:18 PST
Comment hidden (obsolete)
Created
attachment 331842
[details]
Archive of layout-test-results from ews106 for mac-sierra-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews106 Port: mac-sierra-wk2 Platform: Mac OS X 10.12.6
EWS Watchlist
Comment 46
2018-01-20 05:13:23 PST
Comment hidden (obsolete)
Comment on
attachment 331840
[details]
Patch
Attachment 331840
[details]
did not pass mac-debug-ews (mac): Output:
http://webkit-queues.webkit.org/results/6148030
New failing tests: fast/url/user-visible/srb.html fast/url/user-visible/rf.html fast/encoding/idn-security.html fast/url/user-visible/cyrillic-NFD.html
EWS Watchlist
Comment 47
2018-01-20 05:13:24 PST
Comment hidden (obsolete)
Created
attachment 331843
[details]
Archive of layout-test-results from ews114 for mac-sierra The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews114 Port: mac-sierra Platform: Mac OS X 10.12.6
Michael Catanzaro
Comment 48
2018-01-24 18:02:34 PST
At this point, instead of guessing in the dark as to what might be wrong, I would add some debug statements and upload them in a patch. I know it's weird to upload a debug patch to Bugzilla, but anything you print to stderr will show up in the layout test results that Build Bot spits out. That way, you can check the stderr output for each of the four failing tests. If you add enough debug then you should eventually find the problem. Alternatively, I could try to get you access to Igalia's Mac server, but I've never used it before, and I bet it would be harder to learn how to use it than to just fight Build Bot the old fashioned way.
Alex Christensen
Comment 49
2018-01-25 11:04:40 PST
Comment on
attachment 331840
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=331840&action=review
> Source/WebCore/platform/URLParser.h:56 > + template<typename CharacterType> static bool isLookalikeCharacter(CharacterType, CharacterType anotherCharacter); > + static String ICUConvertHostName(const String& hostName, bool encode, const uint32_t (&IDNScriptBlackList)[(USCRIPT_CODE_LIMIT + 31) / 32] = { 0 });
I'm not against moving this from a .mm to a sharable location, but the URLParser is an implementation of
https://url.spec.whatwg.org
so let's make a new class for this.
> Source/WebCore/platform/mac/WebCoreNSURLExtras.mm:48 > -static uint32_t IDNScriptWhiteList[(USCRIPT_CODE_LIMIT + 31) / 32]; > +static uint32_t IDNScriptBlackList[(USCRIPT_CODE_LIMIT + 31) / 32];
Why are you making this change? Let's not change the functionality of the ObjC code.
> Source/WebCore/platform/mac/WebCoreNSURLExtras.mm:639 > - std::optional<UChar32> previousCodePoint; > + UChar32 previousCodePoint = INT32_MAX;
This doesn't look good. We should use std::optional instead of having a sentinel value. Also, there's a signed/unsigned mismatch, but we shouldn't do this anyway.
Gabriel Ivașcu
Comment 50
2018-01-27 04:14:01 PST
(In reply to Alex Christensen from
comment #49
)
> Comment on
attachment 331840
[details]
> > Source/WebCore/platform/URLParser.h:56 > > + template<typename CharacterType> static bool isLookalikeCharacter(CharacterType, CharacterType anotherCharacter); > > + static String ICUConvertHostName(const String& hostName, bool encode, const uint32_t (&IDNScriptBlackList)[(USCRIPT_CODE_LIMIT + 31) / 32] = { 0 }); > > I'm not against moving this from a .mm to a sharable location, but the > URLParser is an implementation of
https://url.spec.whatwg.org
so let's make > a new class for this.
Hmm OK but that will make a class with only two static methods.
> > Source/WebCore/platform/mac/WebCoreNSURLExtras.mm:48 > > -static uint32_t IDNScriptWhiteList[(USCRIPT_CODE_LIMIT + 31) / 32]; > > +static uint32_t IDNScriptBlackList[(USCRIPT_CODE_LIMIT + 31) / 32]; > > Why are you making this change? Let's not change the functionality of the > ObjC code.
I changed this so that callers that don't use script white/black lists wouldn't have to create explicitly a whitelist with all bits set to one to pass to the function. This way, they would leave the function's default blacklist's value of all zeros, i.e. all scripts are allowed. I can change it back however.
> > Source/WebCore/platform/mac/WebCoreNSURLExtras.mm:639 > > - std::optional<UChar32> previousCodePoint; > > + UChar32 previousCodePoint = INT32_MAX; > > This doesn't look good. We should use std::optional instead of having a > sentinel value. Also, there's a signed/unsigned mismatch, but we shouldn't > do this anyway.
OK. But UChar32 is defined as int32_t in WTF/icu/unicode/umachine.h.
Gabriel Ivașcu
Comment 51
2018-01-27 04:15:20 PST
(In reply to Michael Catanzaro from
comment #48
)
> At this point, instead of guessing in the dark as to what might be wrong, I > would add some debug statements and upload them in a patch. I know it's > weird to upload a debug patch to Bugzilla, but anything you print to stderr > will show up in the layout test results that Build Bot spits out. That way, > you can check the stderr output for each of the four failing tests. If you > add enough debug then you should eventually find the problem.
I will give it one more try before resorting to this.
Michael Catanzaro
Comment 52
2018-01-27 09:50:02 PST
(In reply to Gabriel Ivașcu from
comment #50
)
> Hmm OK but that will make a class with only two static methods.
In that case, I would use a namespace instead of a class. It's fine to have global functions outside of any class in C++. The namespace is even optional, but I think it's helpful to group related functions together in cases like this.
> > > Source/WebCore/platform/mac/WebCoreNSURLExtras.mm:48 > I changed this so that callers that don't use script white/black lists > wouldn't have to create explicitly a whitelist with all bits set to one to > pass to the function. This way, they would leave the function's default > blacklist's value of all zeros, i.e. all scripts are allowed. I can change > it back however.
I think the change is good (as long as it's not what's causing the tests to break ;)
> > > Source/WebCore/platform/mac/WebCoreNSURLExtras.mm:639 > > > - std::optional<UChar32> previousCodePoint; > > > + UChar32 previousCodePoint = INT32_MAX; > > > > This doesn't look good. We should use std::optional instead of having a > > sentinel value. Also, there's a signed/unsigned mismatch, but we shouldn't > > do this anyway. > > OK. But UChar32 is defined as int32_t in WTF/icu/unicode/umachine.h.
The U means "Unicode", not "unsigned"... but yes, much better to use std::optional here instead. Also in hasCharactersInIDNScriptBlackList as well. (In reply to Alex Christensen from
comment #49
)
> I'm not against moving this from a .mm to a sharable location, but the > URLParser is an implementation of
https://url.spec.whatwg.org
so let's make > a new class for this.
Gabriel, the tricky part here is that you cannot create new cross-platform files without breaking the Mac build. You can add the new file in WebCore/CMakeLists.txt, but you'll need help from Alex to modify your patch in order to add the new file to the XCode build system, or the Mac EWS is going to be red, which will not be helpful in trying to figure out if your patch passes the Mac tests or not. So for now, my recommendation is to move the code to the URL class instead of URLParser, and make sure it works there. Then, only once it passes tests on Mac, we can move it into a new file if Alex wants to help with that. I kinda think that URL might be a good permanent home for this code, since that keeps it out of URLParser, and being able to call url.displayURL() would be nice, but we'll do whatever Alex prefers.
Gabriel Ivașcu
Comment 53
2018-01-27 10:50:41 PST
Created
attachment 332474
[details]
Patch
EWS Watchlist
Comment 54
2018-01-27 11:51:10 PST
Comment hidden (obsolete)
Comment on
attachment 332474
[details]
Patch
Attachment 332474
[details]
did not pass mac-ews (mac): Output:
http://webkit-queues.webkit.org/results/6234447
New failing tests: fast/url/user-visible/cyrillic-NFD.html fast/url/user-visible/rf.html fast/encoding/idn-security.html fast/url/user-visible/srb.html
EWS Watchlist
Comment 55
2018-01-27 11:51:11 PST
Comment hidden (obsolete)
Created
attachment 332477
[details]
Archive of layout-test-results from ews103 for mac-sierra The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews103 Port: mac-sierra Platform: Mac OS X 10.12.6
EWS Watchlist
Comment 56
2018-01-27 11:56:37 PST
Comment hidden (obsolete)
Comment on
attachment 332474
[details]
Patch
Attachment 332474
[details]
did not pass mac-wk2-ews (mac-wk2): Output:
http://webkit-queues.webkit.org/results/6234452
New failing tests: fast/url/user-visible/srb.html fast/url/user-visible/rf.html fast/url/user-visible/cyrillic-NFD.html
EWS Watchlist
Comment 57
2018-01-27 11:56:38 PST
Comment hidden (obsolete)
Created
attachment 332478
[details]
Archive of layout-test-results from ews104 for mac-sierra-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews104 Port: mac-sierra-wk2 Platform: Mac OS X 10.12.6
EWS Watchlist
Comment 58
2018-01-27 12:17:00 PST
Comment hidden (obsolete)
Comment on
attachment 332474
[details]
Patch
Attachment 332474
[details]
did not pass mac-debug-ews (mac): Output:
http://webkit-queues.webkit.org/results/6234480
New failing tests: fast/url/user-visible/srb.html fast/url/user-visible/rf.html fast/encoding/idn-security.html fast/url/user-visible/cyrillic-NFD.html
EWS Watchlist
Comment 59
2018-01-27 12:17:02 PST
Comment hidden (obsolete)
Created
attachment 332480
[details]
Archive of layout-test-results from ews114 for mac-sierra The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews114 Port: mac-sierra Platform: Mac OS X 10.12.6
Ms2ger (he/him; ⌚ UTC+1/+2)
Comment 60
2018-06-11 03:37:41 PDT
Comment on
attachment 332474
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=332474&action=review
> Source/WebCore/platform/URLParser.cpp:3073 > + if (previousCodePoint != INT32_MAX && URLParser::isLookalikeCharacter(c, previousCodePoint))
Adding the `previousCodePoint != INT32_MAX` seems likely to change the behaviour here.
> Source/WebCore/platform/URLParser.cpp:3290 > + if (allCharactersInIDNScriptWhiteList(outputBuffer, outputLength, IDNScriptWhiteList) && allCharactersAllowedByTLDRules(outputBuffer, outputLength))
In inverting the condition, you changed !(!A && !B) to (A && B) where it should have been (A || B); this is (at least one of the) bug(s) that caused the test failures.
Ms2ger (he/him; ⌚ UTC+1/+2)
Comment 61
2018-06-11 03:37:48 PDT
View in context:
https://bugs.webkit.org/attachment.cgi?id=332474&action=review
> Source/WebCore/platform/URLParser.cpp:3073 > + if (previousCodePoint != INT32_MAX && URLParser::isLookalikeCharacter(c, previousCodePoint))
Adding the `previousCodePoint != INT32_MAX` seems likely to change the behaviour here.
> Source/WebCore/platform/URLParser.cpp:3290 > + if (allCharactersInIDNScriptWhiteList(outputBuffer, outputLength, IDNScriptWhiteList) && allCharactersAllowedByTLDRules(outputBuffer, outputLength))
In inverting the condition, you changed !(!A && !B) to (A && B) where it should have been (A || B); this is (at least one of the) bug(s) that caused the test failures.
Ms2ger (he/him; ⌚ UTC+1/+2)
Comment 62
2018-06-11 06:56:49 PDT
(I'm preparing a new patch for review.)
Gabriel Ivașcu
Comment 63
2018-06-12 02:27:42 PDT
(In reply to Ms2ger from
comment #60
)
> In inverting the condition, you changed !(!A && !B) to (A && B) where it > should have been (A || B); this is (at least one of the) bug(s) that caused > the test failures.
You're so right, too bad for me for missing this. (In reply to Ms2ger from
comment #62
)
> (I'm preparing a new patch for review.)
Thank you for doing this!
Ms2ger (he/him; ⌚ UTC+1/+2)
Comment 64
2018-06-12 09:03:11 PDT
Created
attachment 342542
[details]
Patch
EWS Watchlist
Comment 65
2018-06-12 09:06:02 PDT
Comment hidden (obsolete)
Attachment 342542
[details]
did not pass style-queue: WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/gtk/webkit2.h" ERROR: Source/WebCore/platform/URLParser.cpp:2928: A case label should not be indented, but line up with its switch statement. [whitespace/indent] [4] ERROR: Source/WebCore/platform/URLParser.cpp:3120: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] ERROR: Source/WebCore/platform/URLParser.cpp:3140: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WebCore/platform/URLParser.cpp:3149: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WebCore/platform/URLParser.cpp:3161: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WebCore/platform/URLParser.cpp:3171: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WebCore/platform/URLParser.cpp:3184: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WebCore/platform/URLParser.cpp:3194: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WebCore/platform/URLParser.cpp:3205: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WebCore/platform/URLParser.cpp:3214: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WebCore/platform/URLParser.cpp:3226: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WebCore/platform/URLParser.cpp:3238: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WebCore/platform/URLParser.cpp:3250: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WebCore/platform/URLParser.cpp:3262: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WebCore/platform/URLParser.cpp:3274: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WebCore/platform/URLParser.cpp:3304: One line control clauses should not use braces. [whitespace/braces] [4] ERROR: Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:65: Missing space inside { }. [whitespace/braces] [5] ERROR: Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:73: One line control clauses should not use braces. [whitespace/braces] [4] WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/wpe/webkit.h" WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/gtk/WebKitURIUtilities.h" WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/wpe/WebKitURIUtilities.h" ERROR: Source/WebCore/ChangeLog:10: You should remove the 'No new tests' and either add and list tests, or explain why no new tests were possible. [changelog/nonewtests] [5] Total errors found: 19 in 19 files If any of these errors are false positives, please file a bug against check-webkit-style.
Michael Catanzaro
Comment 66
2018-06-12 09:40:21 PDT
(In reply to Build Bot from
comment #65
)
>
Attachment 342542
[details]
did not pass style-queue: > > > WARNING: File exempt from style guide. Skipping: > "Source/WebKit/UIProcess/API/gtk/webkit2.h" > ERROR: Source/WebCore/platform/URLParser.cpp:2928: A case label should not > be indented, but line up with its switch statement. [whitespace/indent] [4] > ERROR: Source/WebCore/platform/URLParser.cpp:3120: Tests for true/false, > null/non-null, and zero/non-zero should all be done without equality > comparisons. [readability/comparison_to_zero] [5] > ERROR: Source/WebCore/platform/URLParser.cpp:3140: One space before end of > line comments [whitespace/comments] [5] > ERROR: Source/WebCore/platform/URLParser.cpp:3149: One space before end of > line comments [whitespace/comments] [5] > ERROR: Source/WebCore/platform/URLParser.cpp:3161: One space before end of > line comments [whitespace/comments] [5] > ERROR: Source/WebCore/platform/URLParser.cpp:3171: One space before end of > line comments [whitespace/comments] [5] > ERROR: Source/WebCore/platform/URLParser.cpp:3184: One space before end of > line comments [whitespace/comments] [5] > ERROR: Source/WebCore/platform/URLParser.cpp:3194: One space before end of > line comments [whitespace/comments] [5] > ERROR: Source/WebCore/platform/URLParser.cpp:3205: One space before end of > line comments [whitespace/comments] [5] > ERROR: Source/WebCore/platform/URLParser.cpp:3214: One space before end of > line comments [whitespace/comments] [5] > ERROR: Source/WebCore/platform/URLParser.cpp:3226: One space before end of > line comments [whitespace/comments] [5] > ERROR: Source/WebCore/platform/URLParser.cpp:3238: One space before end of > line comments [whitespace/comments] [5] > ERROR: Source/WebCore/platform/URLParser.cpp:3250: One space before end of > line comments [whitespace/comments] [5] > ERROR: Source/WebCore/platform/URLParser.cpp:3262: One space before end of > line comments [whitespace/comments] [5] > ERROR: Source/WebCore/platform/URLParser.cpp:3274: One space before end of > line comments [whitespace/comments] [5] > ERROR: Source/WebCore/platform/URLParser.cpp:3304: One line control clauses > should not use braces. [whitespace/braces] [4]
I know you're just moving code, but best fix these nits at the same time.
> ERROR: Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:65: Missing > space inside { }. [whitespace/braces] [5] > ERROR: Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:73: One line > control clauses should not use braces. [whitespace/braces] [4] > WARNING: File exempt from style guide. Skipping: > "Source/WebKit/UIProcess/API/wpe/webkit.h" > WARNING: File exempt from style guide. Skipping: > "Source/WebKit/UIProcess/API/gtk/WebKitURIUtilities.h" > WARNING: File exempt from style guide. Skipping: > "Source/WebKit/UIProcess/API/wpe/WebKitURIUtilities.h" > ERROR: Source/WebCore/ChangeLog:10: You should remove the 'No new tests' > and either add and list tests, or explain why no new tests were possible. > [changelog/nonewtests] [5] > Total errors found: 19 in 19 files
And the rest of these ERRORs, too.
Michael Catanzaro
Comment 67
2018-06-12 10:07:54 PDT
Comment on
attachment 342542
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=342542&action=review
Thanks a bunch for resurrecting this work. The main issue is that Alex requested that we not add this to URLParser: (In reply to Alex Christensen from
comment #49
)
> I'm not against moving this from a .mm to a sharable location, but the > URLParser is an implementation of
https://url.spec.whatwg.org
so let's make > a new class for this.
So some new home will be needed, e.g. URLHelpers.cpp or URLDecoder.cpp, something like that. r- for this
> Source/WebCore/ChangeLog:8 > + This code is based almost entirely on code by Gabriel Ivascu.
We often credit multiple authors on the authorship line: 2018-06-12 Ms2ger <
Ms2ger@igalia.com
> and Gabriel Ivascu <
gabrielivascu@gnome.org
>
> Source/WebCore/platform/URLParser.cpp:2886 > + > +
Let's drop down to one blank line between functions.
> Source/WebCore/platform/URLParser.cpp:2910 > + > + > +
Ditto.
> Source/WebCore/platform/URLParser.h:56 > + static String ICUConvertHostName(const String& hostName, bool encode, const uint32_t (&IDNScriptWhiteList)[(USCRIPT_CODE_LIMIT + 31) / 32], bool* error);
This function name is rather ambiguous... how about DecodeHostnameForDisplay?
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:43 > + * against IDN homograph attacks so in some cases the host part of the returned
comma after "attacks"
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:48 > +gchar* webkit_uri_for_display(const gchar* uri)
We might want to bikeshed about the name. It's fine with me, but please ask Carlos Garcia if he has a preference here.
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:52 > + auto coreURI = WebCore::URL(WebCore::URL(), String::fromUTF8(uri));
I think we normally use bracket initialization for URLs: auto coreURI = WebCore::URL { { }, String::fromUTF8(uri) };
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:54 > + if (!coreURI.isValid()) > + return g_strdup(uri);
Hm, we should document that if the input is not a valid URI, it returns the original input. I wonder if it would be better to return nullptr when the input is not a valid URI, though. Probably? I'd like an opinion from Carlos Garcia on this. The behavior of strdup()ing the original input is coming from ephy_uri_decode(), which is designed to always return a value for convenience, but it might be better for public API to indicate the error. In that case, the return value would need to be annotated as (nullable) and everywhere we currently return g_strdup(uri) we would need to instead return nullptr.
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:60 > + // Remove password and percent-decode host name. > + coreURI.setPass(emptyString()); > + auto soupURI = coreURI.createSoupURI(); > + if (!soupURI.get()->host) > + return g_strdup(uri);
If the URI contains a password but not a hostname, which is valid, then we fail to remove the password. Oops. That's true for the subsequent early return on line 69, as well. Returning null would avoid this.
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:65 > + uint32_t IDNScriptWhiteList[(USCRIPT_CODE_LIMIT + 31) / 32] = {};
The third parameter of ICUConvertHostName should be declared to use a typedef (or using statement), then this line could become something much saner, like: ICUConvertHostnameWhitelist whitelist = {};
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:73 > + if (convertedHostName.isNull()) { > + convertedHostName = percentDecodedHost; > + }
No brackets around a one-line condition
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:80 > + // Now, decode any percent-encoded characters in the URI. If there are null > + // characters or escaped slashes, this returns nullptr, so just display the > + // encoded URI in that case.
I'm confused, why are percent-decoding the hostname twice? Epiphany feeds the percent-encoded URI into ICU to decode the punycode, then percent-decodes. I think it did not work properly when doing it the other way around. But here you are doing it twice, once before decoding punycode and once after. What is the correct approach? This would also be simpler if we return nullptr on error.
Michael Catanzaro
Comment 68
2018-06-12 10:10:25 PDT
(In reply to Michael Catanzaro from
comment #67
)
> This function name is rather ambiguous... how about DecodeHostnameForDisplay?
I guess that name would only be appropriate if the function were to perform the percent-decoding as well. Perhaps simply: DecodePunycode?
Ms2ger (he/him; ⌚ UTC+1/+2)
Comment 69
2018-06-18 07:18:07 PDT
(In reply to Michael Catanzaro from
comment #67
)
> Comment on
attachment 342542
[details]
> Patch > > > Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:52 > > + auto coreURI = WebCore::URL(WebCore::URL(), String::fromUTF8(uri)); > > I think we normally use bracket initialization for URLs: > > auto coreURI = WebCore::URL { { }, String::fromUTF8(uri) };
Interestingly, this interprets the { } as initializing a WebCore::ParsedURLStringTag. Using WebCore::URL { } helps. (Yay, C++.)
Ms2ger (he/him; ⌚ UTC+1/+2)
Comment 70
2018-06-19 08:37:14 PDT
Created
attachment 343057
[details]
Patch
EWS Watchlist
Comment 71
2018-06-19 08:39:59 PDT
Comment hidden (obsolete)
Attachment 343057
[details]
did not pass style-queue: WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/gtk/webkit2.h" WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/wpe/webkit.h" WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/gtk/WebKitURIUtilities.h" WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/wpe/WebKitURIUtilities.h" ERROR: Source/WebCore/ChangeLog:10: You should remove the 'No new tests' and either add and list tests, or explain why no new tests were possible. [changelog/nonewtests] [5] Total errors found: 1 in 21 files If any of these errors are false positives, please file a bug against check-webkit-style.
Ms2ger (he/him; ⌚ UTC+1/+2)
Comment 72
2018-06-19 09:04:28 PDT
Created
attachment 343059
[details]
Patch
EWS Watchlist
Comment 73
2018-06-19 10:50:09 PDT
Comment hidden (obsolete)
Comment on
attachment 343059
[details]
Patch
Attachment 343059
[details]
did not pass ios-sim-ews (ios-simulator-wk2): Output:
http://webkit-queues.webkit.org/results/8248602
New failing tests: imported/w3c/web-platform-tests/FileAPI/blob/Blob-constructor.html
EWS Watchlist
Comment 74
2018-06-19 10:50:10 PDT
Comment hidden (obsolete)
Created
attachment 343066
[details]
Archive of layout-test-results from ews126 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews126 Port: ios-simulator-wk2 Platform: Mac OS X 10.13.4
Michael Catanzaro
Comment 75
2018-06-19 13:57:01 PDT
Comment on
attachment 343059
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=343059&action=review
Great! Just a couple more comments from me regarding the public functions of the new URLHelpers class. Other than that, it looks ready.
> Source/WebCore/ChangeLog:20 > + * platform/URLParser.cpp: > + (WebCore::isArmenianLookalikeCharacter): code moved without semantic changes. > + (WebCore::isArmenianScriptCharacter): code moved without semantic changes. > + (WebCore::isASCIIDigitOrValidHostCharacter): code moved without semantic changes. > + (WebCore::URLParser::isLookalikeCharacter): code moved without semantic changes. > + (WebCore::allCharactersInIDNScriptWhiteList): code moved without semantic changes. > + (WebCore::isSecondLevelDomainNameAllowedByTLDRules): code moved without semantic changes. > + (WebCore::isRussianDomainNameCharacter): code moved without semantic changes. > + (WebCore::allCharactersAllowedByTLDRules): code moved without semantic changes. > + (WebCore::URLParser::ICUConvertHostName): add. > + * platform/URLParser.h: add signatures.
The ChangeLog is outdated and should be regenerated with prepare-ChangeLog.
> Source/WebCore/platform/URLHelpers.h:34 > +namespace std { > +template<typename> class optional; > +}
I'm glad you tried to avoid a forward declaration, but this is too much work IMO, I would just #include <wtf/Optional.h>
> Source/WebCore/platform/URLHelpers.h:43 > + static bool isLookalikeCharacter(std::optional<UChar32>, UChar32); > + static String decodePunycode(const String& hostName, bool encode, const ICUConvertHostnameWhitelist& IDNScriptWhiteList, bool* error);
Hm, this is still not great... the function signatures are designed as if they're file-private (which they used to be), so they are complex to make them easier to use within their file, but now they are public and it's more important for the signatures to be easy to use. What are the two arguments to isLookalikeCharacter? They should have descriptive names in the header (previousCodePoint and charCode). Now that you've renamed it to decodePunycode, I notice that function named decodePunycode should really probably not have a bool encode parameter. This public interface should probably offer a separate encodePunycode function, with the bool parameter removed. And lastly, returning error via a bool out parameter is pretty weird. Shouldn't it at least be a reference, rather than a pointer?
> Source/WebCore/platform/mac/WebCoreNSURLExtras.mm:111 > if (!readIDNScriptWhiteListFile([bundle pathForResource:@"IDNScriptWhiteList" ofType:@"txt"])) > CRASH();
What's this whitelist file? I see we do not use the whitelist. Is that bad?
Tim Horton
Comment 76
2018-06-19 15:33:07 PDT
Comment on
attachment 343059
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=343059&action=review
> Source/WebCore/platform/URLHelpers.cpp:12 > + * Copyright (C) 2018 Metrological Group B.V. > + * Copyright (C) 2018 Igalia S.L. > + * > + * Redistribution and use in source and binary forms, with or without > + * modification, are permitted provided that the following conditions > + * are met: > + * 1. Redistributions of source code must retain the above copyright > + * notice, this list of conditions and the following disclaimer. > + * 2. Redistributions in binary form must reproduce the above copyright > + * notice, this list of conditions and the following disclaimer in the > + * documentation and/or other materials provided with the distribution.
You can't replace the copyright header on this much moved code. You can add to it, but not replace it.
Tim Horton
Comment 77
2018-06-19 15:38:28 PDT
Comment on
attachment 343059
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=343059&action=review
> Source/WebCore/platform/mac/WebCoreNSURLExtras.mm:137 > + NSString* substring = [string substringWithRange:range];
All your NSString stars are on the wrong side (C++ on the left, ObjC on the right). It's a little weird, but it's what we do.
> Source/WebCore/platform/mac/WebCoreNSURLExtras.mm:145 > + NSString* convertedString_(convertedString);
What is this line for? Why not just return convertedString and/or cast at the return site if needed?
Michael Catanzaro
Comment 78
2018-06-25 08:55:49 PDT
Comment on
attachment 343059
[details]
Patch This is really close, but not quite ready for r+ as there is outstanding review feedback from myself and Tim.
Ms2ger (he/him; ⌚ UTC+1/+2)
Comment 79
2018-06-28 06:07:57 PDT
Created
attachment 343806
[details]
Patch
Michael Catanzaro
Comment 80
2018-06-28 12:19:37 PDT
LGTM, except for one unresolved question: (In reply to Michael Catanzaro from
comment #75
)
> > Source/WebCore/platform/mac/WebCoreNSURLExtras.mm:111 > > if (!readIDNScriptWhiteListFile([bundle pathForResource:@"IDNScriptWhiteList" ofType:@"txt"])) > > CRASH(); > > What's this whitelist file? I see we do not use the whitelist. Is that bad?
Ms2ger (he/him; ⌚ UTC+1/+2)
Comment 81
2018-06-28 23:58:51 PDT
(In reply to Michael Catanzaro from
comment #80
)
> LGTM, except for one unresolved question: > > (In reply to Michael Catanzaro from
comment #75
) > > > Source/WebCore/platform/mac/WebCoreNSURLExtras.mm:111 > > > if (!readIDNScriptWhiteListFile([bundle pathForResource:@"IDNScriptWhiteList" ofType:@"txt"])) > > > CRASH(); > > > > What's this whitelist file? I see we do not use the whitelist. Is that bad?
That's not really clear to me; maybe Gabriel remembers?
Gabriel Ivașcu
Comment 82
2018-06-29 07:31:22 PDT
(In reply to Ms2ger from
comment #81
)
> > > What's this whitelist file? I see we do not use the whitelist. Is that bad? > > That's not really clear to me; maybe Gabriel remembers?
IIRC the current Objective C++ code that runs on Mac and iOS reads the whitelisted characters from a file on disk and builds a bit array used to verify that all characters from a string from a given script are allowed. In WebKitGTK+ I think all characters can be considered whitelisted for now so the bit array can be filled with all ones.
Michael Catanzaro
Comment 83
2018-06-29 07:39:51 PDT
Comment on
attachment 343806
[details]
Patch I found it at Source/WebCore/Resources and it looks important. :) I think we should add it to WebKitResources in Source/WebKit/PlatformGTK.cmake and Source/WebKit/PlatformWPE.cmake and load it using g_resources_lookup_data(). Almost there....
Ms2ger (he/him; ⌚ UTC+1/+2)
Comment 84
2018-07-04 09:15:45 PDT
Created
attachment 344291
[details]
Patch
EWS Watchlist
Comment 85
2018-07-04 11:03:50 PDT
Comment hidden (obsolete)
Comment on
attachment 344291
[details]
Patch
Attachment 344291
[details]
did not pass ios-sim-ews (ios-simulator-wk2): Output:
https://webkit-queues.webkit.org/results/8437543
New failing tests: animations/needs-layout.html
EWS Watchlist
Comment 86
2018-07-04 11:03:52 PDT
Comment hidden (obsolete)
Created
attachment 344294
[details]
Archive of layout-test-results from ews125 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews125 Port: ios-simulator-wk2 Platform: Mac OS X 10.13.4
EWS Watchlist
Comment 87
2018-07-04 13:23:11 PDT
Comment hidden (obsolete)
Comment on
attachment 344291
[details]
Patch
Attachment 344291
[details]
did not pass win-ews (win): Output:
https://webkit-queues.webkit.org/results/8438365
New failing tests: http/tests/security/video-poster-cross-origin-crash2.html
EWS Watchlist
Comment 88
2018-07-04 13:23:23 PDT
Comment hidden (obsolete)
Created
attachment 344298
[details]
Archive of layout-test-results from ews201 for win-future The attached test failures were seen while running run-webkit-tests on the win-ews. Bot: ews201 Port: win-future Platform: CYGWIN_NT-6.1-2.9.0-0.318-5-3-x86_64-64bit
Carlos Garcia Campos
Comment 89
2018-07-05 00:42:22 PDT
I still don't understand why we need to use internal parts of WebCoreNSURLExtras. What's the difference between userVisibleString and webkit_uri_for_display? is that we also decoded percentage encoded parts? To make all this simpler I would split this patch in two: - The first patch would do the WebCore refactoring, making userVisibleString() cross-platform. We could use platform specific bits for reading the whitelist file, but everything done in WebCore. This patch should also make Tools/TestWebKitAPI/Tests/WebCore/cocoa/URLExtras.mm cross-platform, at least the tests for userVisibleString(). And we could also make Internals::userVisibleString cross platform and enable the layout tests using it for al platforms. There are a few other places in WebCore cross-platform code where we might remove cocoa ifdefs. - The second patch simply uses WebCore::userVisibleString() to expose webkit_uri_for_display(). This should include its own unit test since it seems we also decode the percentage encoded parts. I will add specific comments about this part in this bug.
Carlos Garcia Campos
Comment 90
2018-07-05 01:07:09 PDT
Comment on
attachment 344291
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=344291&action=review
> Source/WebKit/SourcesGTK.txt:169 > +UIProcess/API/glib/WebKitURIUtilities.cpp @no-unify
I think this could be shared, you might want to use it in a web extension. I would move it to Shared/API/glib.
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:36 > + **/
**/ -> */
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:40 > +static void readIDNScriptWhiteListFile()
As I commented before I think this could also be refactored to be done by WebCore in cross-platform code. We only need to had a function to read the data that would be platform specific.
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:105 > + **/
Since: 2.22 **/ -> */
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:121 > + if (!coreURI.isValid()) > + return nullptr; > + > + // Remove password and percent-decode host name. > + coreURI.setPass(emptyString()); > + auto soupURI = coreURI.createSoupURI(); > + if (!soupURI.get()->host) > + return nullptr; > + > + GUniquePtr<gchar> percentDecodedHostChars(soup_uri_decode(soupURI.get()->host)); > + auto percentDecodedHost = String::fromUTF8(percentDecodedHostChars.get());
I wonder if we could use WebCore::URLParser for this instead of libsoup to avoid the utf16 -> utf8 -> utf16 conversions
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:128 > + auto convertedHostName = WebCore::URLHelpers::decodePunycode(percentDecodedHost, IDNScriptWhiteList, error); > + if (error) > + return nullptr;
Maybe we could add a GError** parameter to provide more information about what failed.
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:137 > + // Now, decode any percent-encoded characters in the URI. > + GUniquePtr<char> percentEncodedURI(soup_uri_to_string(soupURI.get(), FALSE));
I don't understand this. The comment says we are decoding the uri, but the variable name says it's encoded. Does soup_uri_to_string() decode percentage encoded parts or not? Why do we need to do the hostname part separately? Can't we simply call userVisibleString() and decode the resulting string?
> Source/WebKit/UIProcess/API/gtk/WebKitURIUtilities.h:20 > +#if !defined(__WEBKIT2_H_INSIDE__) && !defined(WEBKIT2_COMPILATION)
&& !defined(__WEBKIT_WEB_EXTENSION_H_INSIDE__) if we make this shared.
> Source/WebKit/UIProcess/API/wpe/WebKitURIUtilities.h:20 > +#if !defined(__WEBKIT_H_INSIDE__) && !defined(WEBKIT2_COMPILATION)
Ditto.
> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitURIUtilities.cpp:53 > + { "
http://www.%7Bexample%7D.com/
", "
http://www.{example}.com/
" }, > + { "
http://example.com/a%2Fb
", nullptr }, // '/' in path needs to remain encoded.
I would like to see more test cases about the percent-decoding thing which is what we make different compared to WebCore:userVisibleString().
Michael Catanzaro
Comment 91
2018-07-17 18:56:30 PDT
This looks at risk of slipping; keep in mind that API freeze begins July 30.
Ms2ger (he/him; ⌚ UTC+1/+2)
Comment 92
2018-12-07 05:14:24 PST
Created
attachment 356804
[details]
Patch
EWS Watchlist
Comment 93
2018-12-07 05:16:48 PST
Comment hidden (obsolete)
Attachment 356804
[details]
did not pass style-queue: ERROR: Source/WTF/wtf/glib/URLHelpersGlib.cpp:30: You should add a blank line after implementation file's own header. [build/include_order] [4] WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/gtk/webkit2.h" WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/gtk/WebKitURIUtilities.h" WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/wpe/WebKitURIUtilities.h" ERROR: Source/WTF/wtf/URLHelpers.h:32: Alphabetical sorting problem. [build/include_order] [4] ERROR: Source/WTF/wtf/URLHelpers.h:43: The parameter name "string" adds no information, so it should be removed. [readability/parameter_name] [5] ERROR: Source/WTF/wtf/URLHelpers.h:44: The parameter name "string" adds no information, so it should be removed. [readability/parameter_name] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:108: A case label should not be indented, but line up with its switch statement. [whitespace/indent] [4] ERROR: Source/WTF/wtf/URLHelpers.cpp:339: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:359: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:368: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:380: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:390: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:403: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:413: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:424: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:433: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:445: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:457: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:469: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:481: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:493: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:518: One line control clauses should not use braces. [whitespace/braces] [4] ERROR: Source/WTF/wtf/URLHelpers.cpp:765: Declaration has space between type name and * in char *q [whitespace/declaration] [3] ERROR: Source/WTF/wtf/URLHelpers.cpp:767: Declaration has space between type name and * in const unsigned char *p [whitespace/declaration] [3] ERROR: Source/WTF/wtf/URLHelpers.cpp:803: Declaration has space between type name and * in char *p [whitespace/declaration] [3] ERROR: Source/WTF/wtf/URLHelpers.cpp:805: Declaration has space between type name and * in char *q [whitespace/declaration] [3] WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/wpe/webkit.h" Total errors found: 24 in 30 files If any of these errors are false positives, please file a bug against check-webkit-style.
EWS Watchlist
Comment 94
2018-12-07 07:09:07 PST
Comment hidden (obsolete)
Comment on
attachment 356804
[details]
Patch
Attachment 356804
[details]
did not pass ios-sim-ews (ios-simulator-wk2): Output:
https://webkit-queues.webkit.org/results/10304573
New failing tests: imported/w3c/web-platform-tests/service-workers/service-worker/register-closed-window.https.html
EWS Watchlist
Comment 95
2018-12-07 07:09:10 PST
Comment hidden (obsolete)
Created
attachment 356808
[details]
Archive of layout-test-results from ews122 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews122 Port: ios-simulator-wk2 Platform: Mac OS X 10.13.6
Michael Catanzaro
Comment 96
2018-12-07 19:32:46 PST
From WPE EWS: lib/libWTF.a(lib/../Source/WTF/wtf/CMakeFiles/WTF.dir/URLHelpers.cpp.o):URLHelpers.cpp:function WTF::mapHostName(WTF::String, std::optional<WTF::String (&)(WTF::String)>, bool*): error: undefined reference to 'WTF::loadIDNScriptWhiteList()'
Michael Catanzaro
Comment 97
2018-12-07 20:27:39 PST
Comment on
attachment 356804
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=356804&action=review
Amazing monster patch! I have a lot of comments, but don't worry: all are for quick trivial stuff. I sense the finish line is very near. My only important complaint is to fix the build failures related to loadIDNScriptWhiteList, which I explain how to do below. The red ios-sim EWS looks unrelated and can be ignored.
> LayoutTests/TestExpectations:-71 > # These tests don't have to be platform-specific, but they are only implemented on Mac now. > -fast/url/user-visible [ Skip ]
Very nice! Good that you saw the opportunity to implement the TestController function.
> Source/WTF/wtf/PlatformGTK.cmake:29 > + glib/URLHelpersGlib.cpp
URLHelpersGLib.cpp (capital L) We should probably rename other files ending in Glib.cpp at some point since I know there's a lot of inconsistency currently.
> Source/WTF/wtf/URLHelpers.cpp:41 > +// Needs to be big enough to hold an IDN-encoded name. > +// For host names bigger than this, we won't do IDN encoding, which is almost certainly OK. > +#define HOST_NAME_BUFFER_LENGTH 2048 > +#define URL_BYTES_BUFFER_LENGTH 2048
Since this is C++ now you should use const unsigned or const int instead of #define, and move them into the WTF namespace.
> Source/WTF/wtf/URLHelpers.cpp:43 > +static uint32_t IDNScriptWhiteList[(USCRIPT_CODE_LIMIT + 31) / 32];
This should also go inside the WTF namespace.
> Source/WTF/wtf/URLHelpers.cpp:67 > +} > + > + > +template<typename CharacterType> inline bool isASCIIDigitOrValidHostCharacter(CharacterType charCode)
One blank line between functions.
> Source/WTF/wtf/URLHelpers.cpp:92 > +} > + > + > + > +static bool isLookalikeCharacter(std::optional<UChar32> previousCodePoint, UChar32 charCode) > +{
Ditto regarding too many blank lines. Also, careful not to copy optionals around unnecessarily. This should be const std::optional<UChar32>& previousCodePoint, or std::optional<UChar32>&& previousCodePoint if you add the WTFMove() at its callsite. Probably the later, but whichever you prefer is fine.
> Source/WTF/wtf/URLHelpers.cpp:284 > +static bool allCharactersInIDNScriptWhiteList(const UChar *buffer, int32_t length)
const UChar* buffer
> Source/WTF/wtf/URLHelpers.cpp:341 > +#define CHECK_RULES_IF_SUFFIX_MATCHES(suffix, function) \ > + { \ > + static const int32_t suffixLength = sizeof(suffix) / sizeof(suffix[0]); \ > + if (length > suffixLength && 0 == memcmp(buffer + length - suffixLength, suffix, sizeof(suffix))) \ > + return isSecondLevelDomainNameAllowedByTLDRules(buffer, length - suffixLength, function); \ > + }
Eh, couldn't this become a function instead of a macro?
> Source/WTF/wtf/URLHelpers.cpp:507 > +String mapHostName(String string, std::optional<DecodeFunction> decodeFunction, bool *error)
bool* error (the * hangs on the type). Or better: bool& error, unless the error parameter is really supposed to be optional, which I doubt. Then I think we need to not copy the String or std::optional by value here. So const String& string and const std::optional<DecodeFunction>& decodeFunction would be good safe declarations. But if the values are being "sunk" then String&& string and std::optional<DecodeFunction>>&& decodeFunction would be better. Looks like they are being sunk throughout this file, but I see that they are APIs accessible to other files, so that swings the balance in favor of const-ref. So that leaves us with: String mapHostName(const String& string, const std::optional<DecodeFunction> decodeFunction, bool& error)
> Source/WTF/wtf/URLHelpers.cpp:511 > + return String();
This is old C++. Modern C++: return { };
> Source/WTF/wtf/URLHelpers.cpp:514 > + return String();
Ditto.
> Source/WTF/wtf/URLHelpers.cpp:528 > + return String();
Ditto.
> Source/WTF/wtf/URLHelpers.cpp:532 > + return String();
Ditto.
> Source/WTF/wtf/URLHelpers.cpp:535 > + return String();
Ditto.
> Source/WTF/wtf/URLHelpers.cpp:542 > +static void collectRangesThatNeedMapping(String string, unsigned location, unsigned length, MappingRangesVector *array, std::optional<DecodeFunction> decodeFunction)
const String& string MappingRangesVector& array -- the * hangs on the type, and then since you dereference it unconditionally, it's not optional/nullable and should use & instead of * const std::optional<DecodeFunction>& decodeFunction
> Source/WTF/wtf/URLHelpers.cpp:545 > + // Therefore, we use nil to indicate no mapping here and an empty array to indicate error.
It's not Objective C++ anymore so "nil" is no longer right.
> Source/WTF/wtf/URLHelpers.cpp:561 > +static void applyHostNameFunctionToMailToURLString(String string, std::optional<DecodeFunction> decodeFunction, MappingRangesVector *array)
Ditto.
> Source/WTF/wtf/URLHelpers.cpp:633 > +static void applyHostNameFunctionToURLString(String string, std::optional<DecodeFunction> decodeFunction, MappingRangesVector *array)
Ditto.
> Source/WTF/wtf/URLHelpers.cpp:692 > +String mapHostNames(String string, std::optional<DecodeFunction> decodeFunction)
Ditto.
> Source/WTF/wtf/URLHelpers.cpp:706 > + return String();
return { };
> Source/WTF/wtf/URLHelpers.cpp:759 > + return emptyString();
return { };
> Source/WTF/wtf/URLHelpers.cpp:763 > + Vector<char, URL_BYTES_BUFFER_LENGTH> after(bufferLength); // large enough to %-escape every character
Comment should be a complete sentence.
> Source/WTF/wtf/URLHelpers.cpp:765 > + char *q = after.data();
char* q
> Source/WTF/wtf/URLHelpers.cpp:767 > + const unsigned char *p = before;
const unsigned char* p
> Source/WTF/wtf/URLHelpers.cpp:803 > + char *p = after.data() + bufferLength - afterlength - 1;
char* p
> Source/WTF/wtf/URLHelpers.cpp:805 > + char *q = after.data();
char* q
> Source/WTF/wtf/URLHelpers.cpp:846 > + return String();
return { };
> Source/WTF/wtf/URLHelpers.h:37 > +using DecodeFunction = String(&)(String);
WTF is a big namespace and DecodeFunction could mean a lot of things. I'd call it URLDecodeFunction.
> Source/WTF/wtf/URLHelpers.h:39 > +WTF_EXPORT_PRIVATE String userVisibleString(CString URL);
const CString& URL?
> Source/WTF/wtf/cocoa/NSURLExtras.h:29 > +#import <wtf/text/WTFString.h>
Are you sure this is needed here? Can't it go in NSURLExtras.mm?
> Source/WTF/wtf/cocoa/NSURLExtras.mm:34 > +#import <wtf/URLHelpers.h>
This isn't sorted properly?
> Source/WTF/wtf/cocoa/NSURLExtras.mm:42 > #define URL_BYTES_BUFFER_LENGTH 2048
This is unused and can be deleted.
> Source/WTF/wtf/cocoa/NSURLExtras.mm:109 > + return !host ? string : (NSString*)host;
Above you used (NSString *)host, which I suspect is the right style here. Not sure. Any Apple developer should know, or you could poke around in the *.mm files more looking for the answer.
> Source/WTF/wtf/cocoa/NSURLExtras.mm:118 > + return !host ? string : (NSString*)host;
(NSString *)?
> Source/WTF/wtf/glib/URLHelpersGlib.cpp:39 > +void loadIDNScriptWhiteList() > +{ > + static std::once_flag flag; > + std::call_once(flag, initializeDefaultIDNScriptWhiteList); > +}
The Windows build failure is because there's no cross-platform implementation. And the WPE build failure is because you forgot to add this file to PlatformWPE.cmake. Well we need a cross-platform implementation because we have to care for JSCOnly, PlayStation, and Windows. And there's nothing GLib-specific here, so this can be that implementation, right? Just rename the file to URLHelpers.cpp, build it on all platforms in the cross-platform CMakeLists.txt (you can even add it to XCode build in case we want to add more functions here in the future), and slap an #if !PLATFORM(COCOA) around the definition here.
> Source/WebKit/PlatformGTK.cmake:106 > + ${WEBKIT_DIR}/UIProcess/API/gtk/WebKitURIUtilities.h
You remembered to install it AND add docs... nice.
> Tools/TestWebKitAPI/Tests/WebKitGLib/TestWebKitURIUtilities.cpp:112 > +void beforeAll() > +{ > + Test::add("WebKitURIUtilities", "uri-for-display-unaffected", testURIForDisplayUnaffected); > + Test::add("WebKitURIUtilities", "uri-for-display-affected", testURIForDisplayAffected); > +}
Although having a test for webkit_uri_for_display() is important, copying the entire thing from the cross-platform test is not valuable. The tests will get out of sync, anyway. Better to test the implementation details in the cross-platform test, since that will be run for all developers. Then the GLib test can become way simpler. E.g. just pass in a %-encoded URL and verify that the result is decoded. The GLib test can be simpler because you have a good cross-platform test.
Michael Catanzaro
Comment 98
2018-12-10 11:16:37 PST
Comment on
attachment 356804
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=356804&action=review
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:45 > +/** > + * webkit_uri_for_display: > + * @uri: the URI to be converted > + * > + * Use this function to format a URI for display. The URIs used internally by > + * WebKit may contain percent-encoded characters or Punycode, which are not > + * generally suitable to display to users. This function provides protection > + * against IDN homograph attacks, so in some cases the host part of the returned > + * URI may be in Punycode if the safety check fails. > + * > + * Returns: (nullable) (transfer full): @uri suitable for display, or %NULL in > + * case of error. > + **/
This is missing the since tag: Since: 2.24
Ms2ger (he/him; ⌚ UTC+1/+2)
Comment 99
2018-12-11 01:41:19 PST
Created
attachment 357043
[details]
Patch (In reply to Michael Catanzaro from
comment #97
)
> Comment on
attachment 356804
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=356804&action=review
> > > Source/WTF/wtf/URLHelpers.cpp:341 > > +#define CHECK_RULES_IF_SUFFIX_MATCHES(suffix, function) \ > > + { \ > > + static const int32_t suffixLength = sizeof(suffix) / sizeof(suffix[0]); \ > > + if (length > suffixLength && 0 == memcmp(buffer + length - suffixLength, suffix, sizeof(suffix))) \ > > + return isSecondLevelDomainNameAllowedByTLDRules(buffer, length - suffixLength, function); \ > > + } > > Eh, couldn't this become a function instead of a macro?
Not easily; you'd at least need a templated function to extract the length of the array, and you'd need to return a tri-state (continue / return true / return false), and your call site would need to become check + return. Not worth it, I think.
> > Source/WTF/wtf/cocoa/NSURLExtras.mm:42 > > #define URL_BYTES_BUFFER_LENGTH 2048 > > This is unused and can be deleted.
It's used in URLByTruncatingOneCharacterBeforeComponent().
> > Source/WebKit/PlatformGTK.cmake:106 > > + ${WEBKIT_DIR}/UIProcess/API/gtk/WebKitURIUtilities.h > > You remembered to install it AND add docs... nice.
I must admit I copied those parts from Gabriel's patch :)
EWS Watchlist
Comment 100
2018-12-11 01:44:51 PST
Comment hidden (obsolete)
Attachment 357043
[details]
did not pass style-queue: ERROR: Source/WTF/wtf/URLHelpers.h:44: The parameter name "string" adds no information, so it should be removed. [readability/parameter_name] [5] ERROR: Source/WTF/wtf/URLHelpers.h:45: The parameter name "string" adds no information, so it should be removed. [readability/parameter_name] [5] WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/gtk/webkit2.h" WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/gtk/WebKitURIUtilities.h" WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/wpe/WebKitURIUtilities.h" ERROR: Source/WTF/wtf/URLHelpers.cpp:43: HOST_NAME_BUFFER_LENGTH is incorrectly named. Don't use underscores in your identifier names. [readability/naming/underscores] [4] ERROR: Source/WTF/wtf/URLHelpers.cpp:44: URL_BYTES_BUFFER_LENGTH is incorrectly named. Don't use underscores in your identifier names. [readability/naming/underscores] [4] ERROR: Source/WTF/wtf/URLHelpers.cpp:116: A case label should not be indented, but line up with its switch statement. [whitespace/indent] [4] ERROR: Source/WTF/wtf/URLHelpers.cpp:347: Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons. [readability/comparison_to_zero] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:367: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:376: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:388: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:398: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:411: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:421: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:432: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:441: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:453: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:465: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:477: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:489: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:501: One space before end of line comments [whitespace/comments] [5] ERROR: Source/WTF/wtf/URLHelpers.cpp:515: string_ is incorrectly named. Don't use underscores in your identifier names. [readability/naming/underscores] [4] ERROR: Source/WTF/wtf/URLHelpers.cpp:526: One line control clauses should not use braces. [whitespace/braces] [4] ERROR: Source/WTF/wtf/URLHelpers.cpp:528: One line control clauses should not use braces. [whitespace/braces] [4] WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/wpe/webkit.h" Total errors found: 22 in 27 files If any of these errors are false positives, please file a bug against check-webkit-style.
Michael Catanzaro
Comment 101
2018-12-11 14:49:04 PST
Comment on
attachment 357043
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=357043&action=review
> Source/WTF/wtf/URLHelpers.h:40 > +WTF_EXPORT_PRIVATE String userVisibleString(const CString& URL);
Sorry for more nitpicking. Nitpicking's all I have left. Looking at how this is used in Internals.cpp, I realize it's a bit unclear: what's a user-visible string? It's clear in the context of this header, but outside of any class... I would add all of these to a new namespace, URLHelpers. The namespace will make it more clear. It's even more important for the non-exported functions, like WTF::mapHostName which is weird to have as a toplevel function. As WTF::URLHelpers::mapHostName it becomes more clear that the function is associated with these other URLHelpers functions. I would also rename userVisibleString to userVisibleURL to make it even more clear.
Michael Catanzaro
Comment 102
2018-12-11 14:49:30 PST
r=me if you placate the style checker. Since this patch adds new API, it requires approval of a second GTK reviewer. Carlos Garcia would be best, since he reviewed it previously.
Carlos Garcia Campos
Comment 103
2018-12-12 00:29:51 PST
Comment on
attachment 357043
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=357043&action=review
Thanks! API looks good to me, note that wpe now has documentation files too.
> Source/WTF/wtf/URLHelpers.cpp:514 > +// Return value of nil means no mapping is necessary. > +// If makeString is NO, then return value is either nil or self to indicate mapping is necessary. > +// If makeString is YES, then return value is either nil or the mapped string.
You should update this comment that still refers to nil and makeString that I don't know what it is.
> Source/WTF/wtf/URLHelpers.cpp:515 > +String mapHostName(const String& string_, const std::optional<URLDecodeFunction>& decodeFunction, bool& error)
string_ -> hostname
> Source/WTF/wtf/URLHelpers.cpp:528 > + if (decodeFunction && string.contains('%')) { > + string = (*decodeFunction)(string_); > + } else { > + string = string_; > + }
Remove the brackets here.
> Source/WTF/wtf/URLHelpers.cpp:539 > + error = true; > + return { };
Could we make this return std::optional<String> and then return std::nullopt instead of using an out bool parameter for error case.
> Source/WTF/wtf/URLHelpers.cpp:553 > +static void collectRangesThatNeedMapping(const String& string, unsigned location, unsigned length, MappingRangesVector& array, std::optional<URLDecodeFunction> decodeFunction)
should decodeFunction be const & like in the other functions?
> Source/WTF/wtf/URLHelpers.cpp:556 > + // Generally, we want to optimize for the case where there is one host name that does not need mapping. > + // Therefore, we use null to indicate no mapping here and an empty array to indicate error.
Can we use std::optional now instead of nullptr?
> Source/WTF/wtf/URLHelpers.cpp:730 > +static void createStringWithEscapedUnsafeCharacters(const Vector<UChar, URL_BYTES_BUFFER_LENGTH>& sourceBuffer, Vector<UChar, URL_BYTES_BUFFER_LENGTH>& outBuffer)
outBuffer could be the return value instead.
> Source/WTF/wtf/URLHelpers.cpp:765 > +String userVisibleString(const CString& URL)
I find weird that this function receives a CString but returns a String. URL -> url
> Source/WTF/wtf/URLHelpers.cpp:767 > + auto before = reinterpret_cast<const unsigned char*>(URL.data());
auto*
> Source/WTF/wtf/URLHelpers.cpp:864 > + Vector<UChar, URL_BYTES_BUFFER_LENGTH> outBuffer; > + createStringWithEscapedUnsafeCharacters(normalizedCharacters, outBuffer); > + return String::adopt(WTFMove(outBuffer));
Can we make createStringWithEscapedUnsafeCharacters return a String?
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:31 > + **/
*/
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:47 > + **/
*/
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:52 > + CString uri_string(uri);
uri_string -> uriString, but I don't think you need the local variable
> Source/WebKit/UIProcess/API/glib/WebKitURIUtilities.cpp:53 > + String result = WTF::userVisibleString(uri_string);
Just pass uri to userVisibleString() the CString will be created implicitly I think
Ms2ger (he/him; ⌚ UTC+1/+2)
Comment 104
2018-12-14 08:32:30 PST
Created
attachment 357316
[details]
Patch I think this addresses all outstanding comments.
WebKit Commit Bot
Comment 105
2018-12-17 06:08:55 PST
Comment on
attachment 357316
[details]
Patch Clearing flags on attachment: 357316 Committed
r239265
: <
https://trac.webkit.org/changeset/239265
>
WebKit Commit Bot
Comment 106
2018-12-17 06:08:58 PST
All reviewed patches have been landed. Closing bug.
Antoine Quint
Comment 107
2018-12-17 10:06:24 PST
I can no longer build WTF following this change: Source/WTF/wtf/URLHelpers.cpp:853:32: error: 'unorm_normalize' is deprecated [-Werror,-Wdeprecated-declarations] int32_t normalizedLength = unorm_normalize(sourceBuffer.data(), sourceBuffer.size(), UNORM_NFC, 0, normalizedCharacters.data(), sourceBuffer.size(), &uerror); ^
Keith Rollin
Comment 108
2018-12-17 10:37:32 PST
Ditto.
Alex Christensen
Comment 109
2018-12-17 10:58:27 PST
Comment on
attachment 357316
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=357316&action=review
> Source/WTF/wtf/URLHelpers.cpp:853 > + int32_t normalizedLength = unorm_normalize(sourceBuffer.data(), sourceBuffer.size(), UNORM_NFC, 0, normalizedCharacters.data(), sourceBuffer.size(), &uerror);
unorm_normalize is deprecated. If you absolutely need to use it, please only use it in GTK-specific code, but I think the deprecation is an indication that this is done the wrong way. This was also a behavior change, which I requested be avoided with this code move. This also broke an internal build, so I intend to roll it out.
WebKit Commit Bot
Comment 110
2018-12-17 11:18:23 PST
Re-opened since this is blocked by
bug 192765
Alex Christensen
Comment 111
2018-12-17 13:23:01 PST
If you try this again, please split it into two patches, one that does nothing but move code with no change in behavior or organization, then another that adds functionality for linux.
Michael Catanzaro
Comment 112
2018-12-17 14:09:53 PST
Comment on
attachment 357316
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=357316&action=review
>> Source/WTF/wtf/URLHelpers.cpp:853 >> + int32_t normalizedLength = unorm_normalize(sourceBuffer.data(), sourceBuffer.size(), UNORM_NFC, 0, normalizedCharacters.data(), sourceBuffer.size(), &uerror); > > unorm_normalize is deprecated. If you absolutely need to use it, please only use it in GTK-specific code, but I think the deprecation is an indication that this is done the wrong way. This was also a behavior change, which I requested be avoided with this code move. This also broke an internal build, so I intend to roll it out.
It should be using unorm2_normalize:
http://icu-project.org/apiref/icu4c/unorm2_8h.html#a0a596802db767da410b4b04cb75cbc53
. It's been deprecated since ICU 56 released in 2015, so Ms2ger you should have noticed a build warning, right? Unless you were working with a truly ancient host system? If it got lost in other warnings, then we might need to consider turning on -Werror for GTK. 😈 Why is normalization needed here but not in the original code? Is it a necessary consequence of the change from NSString to String? I'll water down Alex's request slightly, since we requested a bunch of small improvements to the organization of the code during many rounds of patch review, and trying to disentangle them and land separately could be a significant effort, and spending a bunch of time making the patch worse doesn't make sense. But any functional behavior changes in the moved code, like an extra normalization step, should be split out if feasible. And if not, they should be prominently mentioned in the ChangeLog, so we know what changed. Otherwise, it looks like burying a small important change in a huge code reorganization commit. And we don't want potential regressions to bisect down to a massive code move commit like this. If there are not other behavior changes in URLHelpers.[cpp,h], then splitting the rest into two patches should be easy, since we have a nice clean split between the URLHelpers/NSURLExtras code and the Linux-specific stuff in the layers above.
Alex Christensen
Comment 113
2018-12-17 17:32:17 PST
This is quite carefully crafted code that has evolved over many years. You may have requested a bunch of small improvements during review, but this functionality should not change at all on Apple platforms without an internally-reviewed security bug. You are welcome to use this code on Linux but please make no functional change, however convinced you may be that it is an improvement.
Ms2ger (he/him; ⌚ UTC+1/+2)
Comment 114
2018-12-18 01:04:04 PST
Created
attachment 357550
[details]
Patch The normalization is, AFAIK, a translation of the following code in ObjC: result = [result precomposedStringWithCanonicalMapping]; ... which can't really work in GTK. If you're interested in the individual changes, see
https://github.com/WebKit/webkit/compare/master...Ms2ger:userVisibleString?expand=1
EWS Watchlist
Comment 115
2018-12-18 01:07:38 PST
Comment hidden (obsolete)
Attachment 357550
[details]
did not pass style-queue: WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/gtk/webkit2.h" WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/gtk/WebKitURIUtilities.h" WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/wpe/WebKitURIUtilities.h" ERROR: Source/WTF/wtf/URLHelpers.cpp:853: Declaration has space between type name and * in const UNormalizer2 *normalizer [whitespace/declaration] [3] WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/wpe/webkit.h" Total errors found: 1 in 29 files If any of these errors are false positives, please file a bug against check-webkit-style.
Ms2ger (he/him; ⌚ UTC+1/+2)
Comment 116
2018-12-18 01:12:27 PST
Created
attachment 357551
[details]
Patch
Michael Catanzaro
Comment 117
2018-12-18 08:06:57 PST
So Ms2ger, there should be no functional changes, correct? You're just replacing use of the NSString function precomposedStringWithCanonicalMapping -- documented to return "[a] string made by normalizing the string’s contents using the Unicode Normalization Form C." -- with ICU's unorm2_normalize, documented to return "the normalized form of the source string"? If so, then I agree it's the minimum-required change to translate this from Objective C++ to C++. I would still split out all the changes from Source/WebKit and Tools/TestWebKitAPI into a separate patch, that way all the functional changes can go into a separate bug, even if they are Linux-specific. (In reply to Build Bot from
comment #115
)
> ERROR: Source/WTF/wtf/URLHelpers.cpp:853: Declaration has space between > type name and * in const UNormalizer2 *normalizer [whitespace/declaration] > [3]
Careful here, this needs fixed.
Michael Catanzaro
Comment 118
2018-12-18 08:16:04 PST
(In reply to Alex Christensen from
comment #113
)
> This is quite carefully crafted code that has evolved over many years. You > may have requested a bunch of small improvements during review, but this > functionality should not change at all on Apple platforms without an > internally-reviewed security bug. You are welcome to use this code on Linux > but please make no functional change, however convinced you may be that it > is an improvement.
Code style and organization improvements of the sort appropriate when translating a file from one language to another, not functionality changes. Well, you can see the history here up above. There shouldn't be behavior changes. Ms2ger, is there anything in here as risky as the switch from the NSString to the ICU normalization function? (We know in theory that should not result in behavior changes if both implement the Unicode normalization algorithm correctly, but in practice no doubt these functions will have different sets of bugs.) That change can be done in Objective C++, so if Alex wants to be really strict, he might ask us to do that change and other similar changes first in its own patch, then the code move patch, then the WPE/GTK API patch.
Michael Catanzaro
Comment 119
2018-12-18 08:24:58 PST
Unicode has four normalization forms:
https://unicode.org/reports/tr15/#Norm_Forms
Just for additional due diligence, I checked to ensure you're using unorm2_getNFCInstance(), which corresponds to form C, the same form as the NSString function, so it should be OK. I think we should go ahead and do that separately in its own initial patch, like I suggested above, since it can be done in Objective C++. That way, if that change does introduce unexpected problems, they'll bisect down to a nice separate commit and not the jumbo commit.
Alex Christensen
Comment 120
2018-12-18 09:37:16 PST
(In reply to Ms2ger (he/him; ⌚ UTC+1/+2) from
comment #114
)
> Created
attachment 357550
[details]
> Patch > > The normalization is, AFAIK, a translation of the following code in ObjC: > > result = [result precomposedStringWithCanonicalMapping];
This normalization must not change for Apple platforms. This call has important functionality that cannot be emulated by any one call to a unicode function. To make it work on linux, please add an abstraction for this operation.
Alex Christensen
Comment 121
2018-12-18 09:47:50 PST
(In reply to Michael Catanzaro from
comment #119
)
> Unicode has four normalization forms: > >
https://unicode.org/reports/tr15/#Norm_Forms
> > Just for additional due diligence, I checked to ensure you're using > unorm2_getNFCInstance(), which corresponds to form C, the same form as the > NSString function, so it should be OK. I think we should go ahead and do > that separately in its own initial patch, like I suggested above, since it > can be done in Objective C++. That way, if that change does introduce > unexpected problems, they'll bisect down to a nice separate commit and not > the jumbo commit.
Actually, you're right. unorm2_getNFCInstance is probably similar enough. A small patch would be great.
Michael Catanzaro
Comment 122
2018-12-20 19:38:21 PST
Note: the tests are duplicated with URLExtras.mm. The duplicated tests should be removed from the URLExtras.mm test now that they are added in the cross-platform test.
Michael Catanzaro
Comment 123
2018-12-22 16:42:40 PST
Also, in
bug #192945
Alex noticed a missing error check, should be: const UNormalizer2* normalizer = unorm2_getNFCInstance(&uerror); if (U_FAILURE(uerror)) return { };
Michael Catanzaro
Comment 124
2019-01-14 17:43:07 PST
Also, we need to not lose the recent change from
https://trac.webkit.org/changeset/239586/webkit
.
Michael Catanzaro
Comment 125
2019-01-14 19:26:50 PST
Also:
https://trac.webkit.org/changeset/239967/webkit
Michael Catanzaro
Comment 126
2019-01-15 13:26:54 PST
Two more requests: * Alex wants the unorm2_getNFCInstance normalization to be split into a subroutine, like it already is after
bug #192945
. * Darin has requested we not use String::charactersWithNullTermination here, because we don't need the null-termination:
> The idiom to efficiently get a UChar* buffer would be: > > auto characters = StringView { string }.upconvertedCharacters(); > > Then "characters" can be passed anywhere that needs a "const UChar*". We > should switch to that I think.
Michael Catanzaro
Comment 127
2019-01-15 19:29:19 PST
Darin has also suggested checking with unorm2_quickCheck() to see if normalization is likely required before performing the normalization. OK, maybe that will be enough suggested improvements? This bug has reached epic length!
Ms2ger (he/him; ⌚ UTC+1/+2)
Comment 128
2019-01-28 09:00:16 PST
Created
attachment 360348
[details]
Patch
EWS Watchlist
Comment 129
2019-01-28 09:02:19 PST
Attachment 360348
[details]
did not pass style-queue: WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/gtk/webkit2.h" WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/gtk/WebKitURIUtilities.h" ERROR: Source/WTF/wtf/cocoa/NSURLExtras.mm:111: Use 'WTF::Optional<>' instead of 'std::optional<>'. [runtime/wtf_optional] [4] ERROR: Source/WTF/wtf/cocoa/NSURLExtras.mm:119: Use 'WTF::Optional<>' instead of 'std::optional<>'. [runtime/wtf_optional] [4] WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/wpe/WebKitURIUtilities.h" ERROR: Source/WTF/wtf/URLHelpers.cpp:786: Declaration has space between type name and * in const UNormalizer2 *normalizer [whitespace/declaration] [3] WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/wpe/webkit.h" Total errors found: 3 in 29 files If any of these errors are false positives, please file a bug against check-webkit-style.
Ms2ger (he/him; ⌚ UTC+1/+2)
Comment 130
2019-01-29 00:51:04 PST
Created
attachment 360451
[details]
Patch
Ms2ger (he/him; ⌚ UTC+1/+2)
Comment 131
2019-01-29 08:55:25 PST
Created
attachment 360462
[details]
Patch
EWS Watchlist
Comment 132
2019-01-29 08:57:35 PST
Attachment 360462
[details]
did not pass style-queue: WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/gtk/webkit2.h" WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/gtk/WebKitURIUtilities.h" WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/wpe/WebKitURIUtilities.h" ERROR: Source/WTF/wtf/URLHelpers.cpp:786: Declaration has space between type name and * in const UNormalizer2 *normalizer [whitespace/declaration] [3] WARNING: File exempt from style guide. Skipping: "Source/WebKit/UIProcess/API/wpe/webkit.h" Total errors found: 1 in 29 files If any of these errors are false positives, please file a bug against check-webkit-style.
Michael Catanzaro
Comment 133
2019-01-29 09:48:26 PST
EWS is still pretty sad. (In reply to Build Bot from
comment #132
)
> ERROR: Source/WTF/wtf/URLHelpers.cpp:786: Declaration has space between > type name and * in const UNormalizer2 *normalizer [whitespace/declaration] > [3]
UNormalizer2* normalizer
Ms2ger (he/him; ⌚ UTC+1/+2)
Comment 134
2019-01-30 00:52:59 PST
Created
attachment 360561
[details]
Patch
Ms2ger (he/him; ⌚ UTC+1/+2)
Comment 135
2019-01-30 03:50:23 PST
Created
attachment 360567
[details]
Patch *sigh*
Alex Christensen
Comment 136
2019-01-30 08:54:57 PST
Comment on
attachment 360567
[details]
Patch Wow. I'll have to hear what Tim thinks of this patch.
Michael Catanzaro
Comment 137
2019-01-30 09:29:01 PST
Comment on
attachment 360567
[details]
Patch Note it's already been reviewed by Alex in... OMG,
comment #49
, that was over a year ago. :O Note also: API freeze is Monday, Feb 4, and this remains a rollout risk. Anyway, great to see EWS green. Checklist of changes since the first time it landed: * Avoid test duplication: fail, it seems the tests added in URLHelpers.cpp are still duplicates of the tests in URLExtras.mm. * Added missing error check after use of unorm2_getNFCInstance, good * Change in
r239586
has been ported, good * Change in
r239967
has been ported, good * unorm2_getNFCInstance normalization split into toNormalizationFormC, good * charactersWithNullTermination use removed from toNormalizationFormC... fail, that was missed. * Add unorm2_quickCheck(), also fail, that was missed too. So: Any tests we're adding to the cross-platform test should be removed from the platform-specific test. Ideally the entire test would become cross-platform, but I see that would require a bit more work and we're just days away from API freeze, so I understand if that can't happen now. This is minor, but Darin did request that we not use charactersWithNullTermination if we are just going to cut off the null termination right away. His suggested idiom was: auto characters = StringView { string }.upconvertedCharacters(); Were there any troubles with that? Then the unorm2_quickCheck should be one extra function call towards the top of toNormalizationFormC(), just to avoid calling unorm2_normalize() if it's not really needed. That could be done in a follow-up, of course, but it seems easy?
Michael Catanzaro
Comment 138
2019-02-01 09:21:56 PST
(In reply to Alex Christensen from
comment #136
)
> Comment on
attachment 360567
[details]
> Patch > > Wow. I'll have to hear what Tim thinks of this patch.
BTW please feel free to review today if you want to... don't want this to slip yet another release cycle. I'm giving r- for the duplicated tests, but it's still long past ready for reviews.
Michael Catanzaro
Comment 139
2019-02-01 09:23:05 PST
Another option, if you don't have time to make the tests cross-platform, is to keep them Cocoa-specific and we can just rely on them passing on the Apple bots to know that it's working. This is not the greatest solution, but it's better than duplicating the same list of tests in two different places.
Alex Christensen
Comment 140
2019-02-04 10:27:10 PST
Comment on
attachment 360567
[details]
Patch LGTM other than the test issue mentioned by Michael.
Michael Catanzaro
Comment 141
2019-02-04 13:15:23 PST
(In reply to Michael Catanzaro from
comment #137
)
> Any tests we're adding to the cross-platform test should be removed from the > platform-specific test. Ideally the entire test would become cross-platform, > but I see that would require a bit more work and we're just days away from > API freeze, so I understand if that can't happen now. > > This is minor, but Darin did request that we not use > charactersWithNullTermination if we are just going to cut off the null > termination right away. His suggested idiom was: > > auto characters = StringView { string }.upconvertedCharacters(); > > Were there any troubles with that? > > Then the unorm2_quickCheck should be one extra function call towards the top > of toNormalizationFormC(), just to avoid calling unorm2_normalize() if it's > not really needed. That could be done in a follow-up, of course, but it > seems easy?
I'm going to create a follow-up bug to handle all three of these issues, since they are all relatively minor. (The tests will take some work, but no harm in keeping the Cocoa tests for now.) Ms2ger, thanks for shepherding the patch so far!
Michael Catanzaro
Comment 142
2019-02-04 19:48:21 PST
This patch is the same as Ms2ger's, except I've removed Tools/TestWebKitAPI/Tests/WTF/URLHelpers.cpp. I will try to sort that out in the follow-up
bug #194272
.
Michael Catanzaro
Comment 143
2019-02-04 20:04:35 PST
Committed
r240962
: <
https://trac.webkit.org/changeset/240962
>
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug