Bug 237068 - Add a URL constructor that takes a String
Summary: Add a URL constructor that takes a String
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Web Template Framework (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Chris Dumez
URL:
Keywords: InRadar
Depends on:
Blocks: 237099
  Show dependency treegraph
 
Reported: 2022-02-22 17:01 PST by Chris Dumez
Modified: 2022-02-24 13:30 PST (History)
5 users (show)

See Also:


Attachments
Patch (35.36 KB, patch)
2022-02-22 17:05 PST, Chris Dumez
no flags Details | Formatted Diff | Diff
Patch (40.47 KB, patch)
2022-02-22 20:08 PST, Chris Dumez
no flags Details | Formatted Diff | Diff
Patch (40.47 KB, patch)
2022-02-22 21:43 PST, Chris Dumez
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Chris Dumez 2022-02-22 17:01:34 PST
Add a URL constructor that takes a String to simplify our code base a bit.
Comment 1 Chris Dumez 2022-02-22 17:05:10 PST
Created attachment 452919 [details]
Patch
Comment 2 Darin Adler 2022-02-22 19:06:35 PST
Comment on attachment 452919 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=452919&action=review

> Source/WTF/wtf/URL.h:71
> +    explicit URL(const String& urlString)
> +        : URL(URL(), urlString)
> +    {
> +    }

Is the purpose of this so obvious that it doesn’t even need to be stated in a comment? The constructor above is quite explicit.

Do we also want to point out that sometimes we would not want to just parse a string as a URL with no base URL without some additional processing?

Should we also have an overload that takes a String&& so we can save reference count churn in common cases where we end up just keeping the string unmodified?

> Source/WTF/wtf/URL.h:321
> +    return URL(WTFMove(*string));

This code is written as if we had a version that takes an rvalue reference.

> Source/WebCore/Modules/websockets/WebSocket.cpp:226
> +    m_url = URL { url };

Don’t think we need to write URL here, just { url } will do. You of course are welcome to keep the URL if you think it helps readability. Or it’s possible I am wrong because of the use of "explicit"?

> Source/WebCore/bindings/js/ScriptModuleLoader.cpp:226
> +    return URL { asString(moduleKeyValue)->value(&jsGlobalObject) };

No need for URL here, just braces will do.

> Source/WebCore/bindings/js/ScriptModuleLoader.cpp:331
> +        baseURL = URL { sourceOrigin.string() };

No need for URL here, just braces will do.

> Source/WebCore/bindings/js/SerializedScriptValue.cpp:2567
> +        file = File::deserialize(executionContext(m_lexicalGlobalObject), filePath, URL { url->string() }, type->string(), name->string(), optionalLastModified);

No need for URL here, just braces will do.

> Source/WebCore/bindings/js/SerializedScriptValue.cpp:3652
> +            return getJSValue(Blob::deserialize(executionContext(m_lexicalGlobalObject), URL { url->string() }, type->string(), size, blobFilePathForBlobURL(url->string())).get());

No need for URL here, just braces will do.

> Source/WebCore/contentextensions/ContentExtensionActions.cpp:385
> +        request.setURL(URL { action.url });

No need for URL here, just braces will do.

> Source/WebCore/dom/Document.cpp:5342
> +                m_referrerOverride = URL(referrerURL.protocolHostAndPort()).string();

I am unclear what this code does. I am pretty sure that converting to a URL and back to a string is a round trip no-op that does nothing. Sometimes makes a valid URL, other times makes an invalid URL, but never changes the string. Am I wrong about this?

> Source/WebCore/loader/DocumentLoader.cpp:852
> +    return URL { "https://www.microsoft.com/en-us/microsoft-365/microsoft-teams/"_s };

No need for URL here, just braces will do.

> Source/WebCore/loader/archive/cf/LegacyWebArchive.cpp:228
> +    return ArchiveResource::create(SharedBuffer::create(resourceData), URL { url }, mimeType, textEncoding, frameName, response);

No need for URL here, just braces will do.

> Source/WebCore/page/DOMWindow.cpp:2586
> +            auto radioPlayerDomain = RegistrableDomain(URL { Quirks::staticRadioPlayerURLString() });
> +            auto BBCDomain = RegistrableDomain(URL { Quirks::BBCRadioPlayerURLString() });

Not sure we need URL here, likely just braces will do.

> Source/WebCore/page/Quirks.cpp:1101
> +    static NeverDestroyed<RegistrableDomain> BBCDomain = RegistrableDomain(URL { Quirks::BBCRadioPlayerURLString() });

Ditto.

> Source/WebCore/page/Quirks.cpp:1175
> +    static NeverDestroyed<URL> kinjaURL = URL { "https://kinja.com" };

Could just write NeverDestroyed here instead of NeverDestroyed<URL>.

> Source/WebCore/page/SecurityOrigin.cpp:593
> +    return SecurityOrigin::create(URL { originString });

Not sure we need URL here, likely just braces will do.

> Source/WebCore/page/SecurityOrigin.cpp:599
> +    auto origin = create(URL { protocol + "://" + host + "/" });

Not sure we need URL here, likely just braces will do.

> Source/WebCore/page/SecurityOriginData.cpp:54
> +    return URL { toString() };

Don’t need URL here, just braces will do.

> Source/WebCore/page/SecurityPolicy.cpp:95
> +    ASSERT(referrer == URL(referrer).strippedForUseAsReferrer()
> +        || referrer == SecurityOrigin::create(URL { referrer })->toString());

Should write this the same way on these two successive lines?

> Source/WebCore/platform/graphics/gstreamer/MediaPlayerPrivateGStreamer.cpp:925
> +    m_url = URL { cleanURLString };

No need for URL here, just braces will do.

> Source/WebCore/platform/network/cf/ResourceErrorCF.cpp:123
> +            m_failingURL = URL { failingURLString };

No need for URL here, just braces will do.

> Source/WebCore/platform/network/curl/CurlRequest.cpp:369
> +        m_response.proxyUrl = URL { *proxyUrl };

No need for URL here, just braces will do. Also surprised they named it proxyUrl and not proxyURL.

> Source/WebCore/platform/network/mac/ResourceErrorMac.mm:163
> +        m_failingURL = URL { failingURLString };

No need for URL here, just braces will do.

> Source/WebCore/platform/win/PasteboardWin.cpp:455
> +        WebCore::writeURL(m_writableDataObject.get(), URL { data }, String(), false, true);

Might not need URL here, maybe just braces will do.

> Source/WebCore/testing/Internals.cpp:5691
> +    frame->loader().client().sendH2Ping(URL { url }, [promise = WTFMove(promise)] (Expected<Seconds, ResourceError>&& result) mutable {

Ditto.
Comment 3 Chris Dumez 2022-02-22 19:43:22 PST
(In reply to Darin Adler from comment #2)
> Comment on attachment 452919 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=452919&action=review
> 
> > Source/WTF/wtf/URL.h:71
> > +    explicit URL(const String& urlString)
> > +        : URL(URL(), urlString)
> > +    {
> > +    }
> 
> Is the purpose of this so obvious that it doesn’t even need to be stated in
> a comment? The constructor above is quite explicit.
> 
> Do we also want to point out that sometimes we would not want to just parse
> a string as a URL with no base URL without some additional processing?
> 
> Should we also have an overload that takes a String&& so we can save
> reference count churn in common cases where we end up just keeping the
> string unmodified?
> 
> > Source/WTF/wtf/URL.h:321
> > +    return URL(WTFMove(*string));
> 
> This code is written as if we had a version that takes an rvalue reference.
> 
> > Source/WebCore/Modules/websockets/WebSocket.cpp:226
> > +    m_url = URL { url };
> 
> Don’t think we need to write URL here, just { url } will do. You of course
> are welcome to keep the URL if you think it helps readability. Or it’s
> possible I am wrong because of the use of "explicit"?
> 
> > Source/WebCore/bindings/js/ScriptModuleLoader.cpp:226
> > +    return URL { asString(moduleKeyValue)->value(&jsGlobalObject) };
> 
> No need for URL here, just braces will do.
> 
> > Source/WebCore/bindings/js/ScriptModuleLoader.cpp:331
> > +        baseURL = URL { sourceOrigin.string() };
> 
> No need for URL here, just braces will do.
> 
> > Source/WebCore/bindings/js/SerializedScriptValue.cpp:2567
> > +        file = File::deserialize(executionContext(m_lexicalGlobalObject), filePath, URL { url->string() }, type->string(), name->string(), optionalLastModified);
> 
> No need for URL here, just braces will do.
> 
> > Source/WebCore/bindings/js/SerializedScriptValue.cpp:3652
> > +            return getJSValue(Blob::deserialize(executionContext(m_lexicalGlobalObject), URL { url->string() }, type->string(), size, blobFilePathForBlobURL(url->string())).get());
> 
> No need for URL here, just braces will do.
> 
> > Source/WebCore/contentextensions/ContentExtensionActions.cpp:385
> > +        request.setURL(URL { action.url });
> 
> No need for URL here, just braces will do.
> 
> > Source/WebCore/dom/Document.cpp:5342
> > +                m_referrerOverride = URL(referrerURL.protocolHostAndPort()).string();
> 
> I am unclear what this code does. I am pretty sure that converting to a URL
> and back to a string is a round trip no-op that does nothing. Sometimes
> makes a valid URL, other times makes an invalid URL, but never changes the
> string. Am I wrong about this?

The purpose of the code is to extract the origin out of the URL string (only protocol / host / port, not the rest of the URL).

Maybe this would be clearer?
m_referrerOverride = SecurityOriginData::fromURL(referrerURL).toString();
Comment 4 Chris Dumez 2022-02-22 19:46:01 PST
(In reply to Darin Adler from comment #2)
> Comment on attachment 452919 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=452919&action=review
> 
> > Source/WTF/wtf/URL.h:71
> > +    explicit URL(const String& urlString)
> > +        : URL(URL(), urlString)
> > +    {
> > +    }
> 
> Is the purpose of this so obvious that it doesn’t even need to be stated in
> a comment? The constructor above is quite explicit.
> 
> Do we also want to point out that sometimes we would not want to just parse
> a string as a URL with no base URL without some additional processing?
> 
> Should we also have an overload that takes a String&& so we can save
> reference count churn in common cases where we end up just keeping the
> string unmodified?
> 
> > Source/WTF/wtf/URL.h:321
> > +    return URL(WTFMove(*string));
> 
> This code is written as if we had a version that takes an rvalue reference.
> 
> > Source/WebCore/Modules/websockets/WebSocket.cpp:226
> > +    m_url = URL { url };
> 
> Don’t think we need to write URL here, just { url } will do. You of course
> are welcome to keep the URL if you think it helps readability. Or it’s
> possible I am wrong because of the use of "explicit"?

It doesn't build:
./Modules/websockets/WebSocket.cpp:226:11: error: no viable overloaded '='
    m_url = { url };
Comment 5 Chris Dumez 2022-02-22 19:57:04 PST
I am going to apply the review comments that I can but most of them are about using { } instead of URL { } and it doesn't appear to build in any of the cases (likely because the constructor is explicit). I do think we want it to be explicit, given that parsing a String as a URL is fairly expensive).
Comment 6 Darin Adler 2022-02-22 20:07:25 PST
Agreed. Let’s leave it explicit.
Comment 7 Chris Dumez 2022-02-22 20:08:11 PST
Created attachment 452935 [details]
Patch
Comment 8 Chris Dumez 2022-02-22 21:40:07 PST
Comment on attachment 452935 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=452935&action=review

> Source/WebCore/dom/Document.cpp:-5342
> -                m_referrerOverride = URL(URL(), referrerURL.protocolHostAndPort()).string();

`URL(URL(), referrerURL.protocolHostAndPort()).string()` or `referrerURL.protocolHostAndPort()` give:
http://127.0.0.1:8000

Pre-existing code was giving:
http://127.0.0.1:8000/

I checked in Chrome and I indeed see a trailing slash there too. I'll keep the existing logic to avoid breakage (2 tests did fail).
Comment 9 Chris Dumez 2022-02-22 21:43:55 PST
Created attachment 452936 [details]
Patch
Comment 10 EWS 2022-02-22 23:20:02 PST
Committed r290350 (247668@main): <https://commits.webkit.org/247668@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 452936 [details].
Comment 11 Radar WebKit Bug Importer 2022-02-22 23:21:18 PST
<rdar://problem/89339510>
Comment 12 Alex Christensen 2022-02-24 13:30:38 PST
Comment on attachment 452936 [details]
Patch

A few years ago I tried this and ran into some issues but this can now be done safely since URL doesn't have an operator String.
This should probably eventually be a step towards the only constructor for a URL taking a String, a URL with a default parameter of an empty URL for the base URL, and an encoding with a default parameter of null.  To do that, we would have to remove lots of URL() calls in WebKit.