Bug 166041

Summary: Record/replay: fix range used for fuzzy matching
Product: WebKit Reporter: Keith Rollin <krollin>
Component: WebKit Misc.Assignee: Keith Rollin <krollin>
Status: RESOLVED FIXED    
Severity: Normal CC: achristensen, commit-queue, koivisto
Priority: P2    
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch none

Description Keith Rollin 2016-12-19 15:13:26 PST
Because of two bugs, the attempt to determine the range of URLs to check as part of the process of fuzzy matching was failing. The intent was to find the range of URLs that started with the same <scheme://host:port> as a given URL. However, because of a reversed test, the upper end of the range ended up being the "end()" iterator of the entire collection of URLs. With that fixed, there was another bug due to one URL being given as <scheme://host:port> and the other given as <scheme://host:port/> (note the trailing slash).
Comment 1 Keith Rollin 2016-12-19 18:23:03 PST
Created attachment 297496 [details]
Patch
Comment 2 Ryan Haddad 2016-12-19 19:01:56 PST
The mac-ews failure may be my fault. I was taking bots offline to update them.
Comment 3 Keith Rollin 2016-12-20 16:21:59 PST
Yeah, there was the following build error:

Extractor.cpp:26:
/Volumes/Data/EWS/WebKit/Source/JavaScriptCore/config.h:26:10: fatal error: 'wtf/Platform.h' file not found
#include <wtf/Platform.h>

There's no way my patch led to that. So the patch should still be reviewable as-is.
Comment 4 Darin Adler 2016-12-22 09:22:50 PST
Comment on attachment 297496 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=297496&action=review

Is there a way we can add regression tests for this? I am concerned about fixing bugs and adding code without regression test coverage, even in diagnostic code.

> Source/WebKit2/NetworkProcess/capture/NetworkCaptureManager.cpp:427
> +    auto pathStart = url.pathStart();
> +    auto baseURLStr = url.string().left(pathStart);
> +    WebCore::URLParser parser(baseURLStr);
> +    return parser.result();

Seems to me that this should be a function in URL or URLParser itself, not something in the network capture code. I don’t think the name “base URL” is right for this, since that is a term that means something specific in the context of URLs. It’s more like topLevelOfServerURL; it would be good to use a more precise term in the name here.
Comment 5 Alex Christensen 2016-12-22 11:11:58 PST
Comment on attachment 297496 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=297496&action=review

>> Source/WebKit2/NetworkProcess/capture/NetworkCaptureManager.cpp:427
>> +    return parser.result();
> 
> Seems to me that this should be a function in URL or URLParser itself, not something in the network capture code. I don’t think the name “base URL” is right for this, since that is a term that means something specific in the context of URLs. It’s more like topLevelOfServerURL; it would be good to use a more precise term in the name here.

This should definitely be in URL.h and URL.cpp.  And we shouldn't need to reprise the substring.  It looks like you just want to include the '/' after the host if there is one.  There will always be one if the scheme is something like http, but an unknown scheme without a '/' like "asdf://noSlash" won't have one.
Comment 6 Keith Rollin 2016-12-22 12:08:13 PST
(In reply to comment #4)
> Is there a way we can add regression tests for this? I am concerned about
> fixing bugs and adding code without regression test coverage, even in
> diagnostic code.

I'll file a bug and consult with others after the break. It's not clear to me how this could/should be tested.

> Seems to me that this should be a function in URL or URLParser itself, not
> something in the network capture code.

OK, I'll file another bug for that. Two bugs, I guess: one to add the feature to URL, the other to use it in Network Capture.
Comment 7 Keith Rollin 2017-01-05 18:26:45 PST
The bug for adding test coverage is Bug 166427. The bug for extending URL is Bug 166426. The patch in this bug has been rebased to use that new interface.
Comment 8 Keith Rollin 2017-01-05 18:28:13 PST
Created attachment 298164 [details]
Patch
Comment 9 Keith Rollin 2017-01-05 18:29:19 PST
I'm taking that last patch down. Forgot to actually, you know, test it.
Comment 10 Keith Rollin 2017-01-09 15:09:07 PST
Created attachment 298398 [details]
Patch
Comment 11 Alex Christensen 2017-01-09 16:05:57 PST
Comment on attachment 298398 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=298398&action=review

> Source/WebKit2/NetworkProcess/capture/NetworkCaptureManager.cpp:423
> +    return WebCore::URLParser(url.protocolHostAndPort()).result();

Can we call the URL constructor instead of creating a URLParser (which should only be created by the URL constructor)?

> Source/WebKit2/NetworkProcess/capture/NetworkCaptureResource.h:66
>      std::optional<WebCore::URL> m_url;
> -    std::optional<WebCore::URL> m_baseURL;
> +    std::optional<WebCore::URL> m_urlIdentifyingCommonDomain;

optional<URL> is kind of strange.  A URL is already optional.  isValid() says if it contains a valid URL, and if it doesn't then it is like nullopt with a String we should only be using for things like the anchor tag.
Comment 12 Keith Rollin 2017-01-09 17:10:35 PST
(In reply to comment #11)
> Comment on attachment 298398 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=298398&action=review
> 
> > Source/WebKit2/NetworkProcess/capture/NetworkCaptureManager.cpp:423
> > +    return WebCore::URLParser(url.protocolHostAndPort()).result();
> 
> Can we call the URL constructor instead of creating a URLParser (which
> should only be created by the URL constructor)?

OK.

> > Source/WebKit2/NetworkProcess/capture/NetworkCaptureResource.h:66
> >      std::optional<WebCore::URL> m_url;
> > -    std::optional<WebCore::URL> m_baseURL;
> > +    std::optional<WebCore::URL> m_urlIdentifyingCommonDomain;
> 
> optional<URL> is kind of strange.  A URL is already optional.  isValid()
> says if it contains a valid URL, and if it doesn't then it is like nullopt
> with a String we should only be using for things like the anchor tag.

Antti suggested using Optional in https://bugs.webkit.org/show_bug.cgi?id=164527#c9:

>> Source/WebKit2/NetworkProcess/capture/NetworkCaptureCachedResource.h:87
>> +    // Lazily calculated.
>> +    WebCore::URL m_url;
>> +    WebCore::URL m_baseURL;
>> +    WebCore::URLParser::URLEncodedForm m_queryParameters;
>> +
>> +    bool m_haveURL { false };
>> +    bool m_haveBaseURL { false };
>> +    bool m_haveQueryParameters { false };
> 
> Often this sort of bool+field pairs end up reading better with Optional:
> 
> Optional<WebCore::URL> m_url;

I'm OK with changing it, though. Your way makes sense.
Comment 13 Keith Rollin 2017-01-10 12:32:26 PST
(In reply to comment #12)
> (In reply to comment #11)
> > Comment on attachment 298398 [details]
> > Patch
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=298398&action=review
> > 
> > > Source/WebKit2/NetworkProcess/capture/NetworkCaptureManager.cpp:423
> > > +    return WebCore::URLParser(url.protocolHostAndPort()).result();
> > 
> > Can we call the URL constructor instead of creating a URLParser (which
> > should only be created by the URL constructor)?
> 
> OK.

Actually, if we parse via the URL c'tor, I think we'll run afoul of that ASSERT in it. I'll be passing in a string that doesn't contain a trailing slash, but the parsed-then-stringified output will have that slash, and the ASSERT will trigger.

I think I'll just skip using URL's in this context and use strings. I was using a URL here for consistency between url() and baseURL(). But since that consistency comes at the expense of a lot of unnecessary parsing, it might be best to forego it. I'll just use a String here (and use isNull instead of wrapping in an std::optional).
Comment 14 Keith Rollin 2017-01-10 13:24:41 PST
Created attachment 298507 [details]
Patch
Comment 15 WebKit Commit Bot 2017-01-10 14:32:08 PST
Comment on attachment 298507 [details]
Patch

Clearing flags on attachment: 298507

Committed r210561: <http://trac.webkit.org/changeset/210561>
Comment 16 WebKit Commit Bot 2017-01-10 14:32:13 PST
All reviewed patches have been landed.  Closing bug.