Bug 231410 - Scroll To Text Fragment directive parsing
Summary: Scroll To Text Fragment directive parsing
Status: REOPENED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Megan Gardner
URL:
Keywords: InRadar
Depends on: 231870 231871
Blocks:
  Show dependency treegraph
 
Reported: 2021-10-07 21:19 PDT by Megan Gardner
Modified: 2021-10-16 18:52 PDT (History)
9 users (show)

See Also:


Attachments
Patch (15.70 KB, patch)
2021-10-07 21:24 PDT, Megan Gardner
no flags Details | Formatted Diff | Diff
Patch (15.70 KB, patch)
2021-10-08 09:09 PDT, Megan Gardner
ews-feeder: commit-queue-
Details | Formatted Diff | Diff
Patch (15.73 KB, patch)
2021-10-08 09:34 PDT, Megan Gardner
no flags Details | Formatted Diff | Diff
Patch (15.75 KB, patch)
2021-10-08 10:26 PDT, Megan Gardner
no flags Details | Formatted Diff | Diff
Patch (15.74 KB, patch)
2021-10-08 12:58 PDT, Megan Gardner
no flags Details | Formatted Diff | Diff
Patch (16.07 KB, patch)
2021-10-08 19:54 PDT, Megan Gardner
no flags Details | Formatted Diff | Diff
Patch (16.17 KB, patch)
2021-10-11 12:44 PDT, Megan Gardner
ews-feeder: commit-queue-
Details | Formatted Diff | Diff
Patch (16.18 KB, patch)
2021-10-11 14:59 PDT, Megan Gardner
ews-feeder: commit-queue-
Details | Formatted Diff | Diff
Patch (16.27 KB, patch)
2021-10-11 20:03 PDT, Megan Gardner
no flags Details | Formatted Diff | Diff
Patch (16.28 KB, patch)
2021-10-12 11:26 PDT, Megan Gardner
no flags Details | Formatted Diff | Diff
Patch (16.09 KB, patch)
2021-10-12 13:25 PDT, Megan Gardner
no flags Details | Formatted Diff | Diff
Patch (16.09 KB, patch)
2021-10-12 14:57 PDT, Megan Gardner
no flags Details | Formatted Diff | Diff
Patch (16.32 KB, patch)
2021-10-12 16:36 PDT, Megan Gardner
no flags Details | Formatted Diff | Diff
Patch (16.38 KB, patch)
2021-10-13 15:04 PDT, Megan Gardner
no flags Details | Formatted Diff | Diff
Patch (16.47 KB, patch)
2021-10-13 15:36 PDT, Megan Gardner
no flags Details | Formatted Diff | Diff
Patch (16.49 KB, patch)
2021-10-13 15:49 PDT, Megan Gardner
no flags Details | Formatted Diff | Diff
Patch for landing (16.46 KB, patch)
2021-10-13 18:12 PDT, Megan Gardner
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Megan Gardner 2021-10-07 21:19:21 PDT
Scroll To Text Fragment directive parsing
Comment 1 Megan Gardner 2021-10-07 21:24:21 PDT
Created attachment 440570 [details]
Patch
Comment 2 Tim Horton 2021-10-07 21:42:17 PDT
Comment on attachment 440570 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=440570&action=review

> Source/WebCore/dom/FragmentDirectiveParser.cpp:62
> +    m_URLFragment = fragmentIdentifier.substring(0, fragmentDirectiveStart);

I would call this like "remaining fragment" or something. Also the comment should have a capital/punctuation.

> Source/WebCore/dom/FragmentDirectiveParser.cpp:84
> +    LOG_WITH_STREAM(Scrolling, stream << *this << " parseFragmentDirective: ");

I think this probably should not use the Scrolling log channel

> Source/WebCore/dom/FragmentDirectiveParser.cpp:88
> +        // FIXME: add decoding for % encoded strings.

This is fairly critical, because it comes up as soon as you have a space in the string :)

Oddly the decoding code that decodeURIComponent is inline in JSGlobalObjectFunctions (and the URL parser's is similarly internal to it), so it's not clear what the right way to do this is.

> Source/WebCore/dom/FragmentDirectiveParser.h:45
> +class FragmentDirectiveParser : public RefCounted<FragmentDirectiveParser> {

Unclear why this is refcounted, probably just a plain class?

> Source/WebCore/page/FrameView.cpp:2224
> +    }

I'm guessing you either want to bail here, or pass the fragment parser's "remaining fragment" on down below, but likely /not/ fall through like this.
Comment 3 Megan Gardner 2021-10-08 09:09:07 PDT
Created attachment 440628 [details]
Patch
Comment 4 Megan Gardner 2021-10-08 09:34:22 PDT
Created attachment 440632 [details]
Patch
Comment 5 Megan Gardner 2021-10-08 10:26:16 PDT
Created attachment 440638 [details]
Patch
Comment 6 Megan Gardner 2021-10-08 12:58:51 PDT
Created attachment 440658 [details]
Patch
Comment 7 Megan Gardner 2021-10-08 19:54:11 PDT
Created attachment 440699 [details]
Patch
Comment 8 Megan Gardner 2021-10-11 12:44:40 PDT
Created attachment 440823 [details]
Patch
Comment 9 Megan Gardner 2021-10-11 14:59:00 PDT
Created attachment 440843 [details]
Patch
Comment 10 Alex Christensen 2021-10-11 15:07:44 PDT
Comment on attachment 440843 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=440843&action=review

Initial glance through seems reasonable, needs some polish and tests.

> Source/WebCore/dom/Document.h:1631
> +    String& fragmentDirective() { return m_fragmentDirective; }

This one should probably be removed.

> Source/WebCore/dom/FragmentDirectiveParser.cpp:37
> +    String fragmentDirectiveDelimiter = ":~:"_s;

ASCIILiteral, which would prevent an allocation.

> Source/WebCore/dom/FragmentDirectiveParser.cpp:40
> +    m_isValid = false;

This should probably be done with an initializer list in the header
bool m_isValid { false };

> Source/WebCore/dom/FragmentDirectiveParser.cpp:79
> +    auto directives = fragmentDirective.split("&");

split('&')

> Source/WebCore/dom/FragmentDirectiveParser.h:32
> +class URL;

This should probably be replaced by wtf/Forward.h, which also has "using WTF::URL"

> Source/WebCore/dom/FragmentDirectiveParser.h:52
> +    String fragmentDirective() { return m_fragmentDirective; };
> +    StringView remainingURLFragment() { return m_remainingURLFragment; };

const String& fragmentDirective() const
StringView remainingURLFragment() const

> Source/WebCore/dom/FragmentDirectiveParser.h:57
> +    bool parseFragmentDirective(String fragmentDirective);

Can this take a StringView instead?
Comment 11 Chris Dumez 2021-10-11 15:17:50 PDT
Comment on attachment 440843 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=440843&action=review

> Source/WebCore/dom/FragmentDirectiveParser.cpp:68
> +bool FragmentDirectiveParser::parseFragmentDirective(String fragmentDirective)

I agree with Alex that this should take in a StringView, but even if it took at String, we would not pass it by value.

> Source/WebCore/dom/FragmentDirectiveParser.cpp:83
> +    for (auto directive : directives) {

auto&

> Source/WebCore/dom/FragmentDirectiveParser.cpp:90
> +        for (auto token : tokens) {

auto&

> Source/WebCore/dom/FragmentDirectiveParser.cpp:99
> +            tokens[0].truncate(tokens[0].length()-2);

missing spaces around -

What if the token is '-' ? (would result in `1 - 2`)

> Source/WebCore/dom/FragmentDirectiveParser.h:46
> +    WTF_MAKE_FAST_ALLOCATED(FragmentDirectiveParser);

Not needed since you subclass RefCounted.

> Source/WebCore/dom/FragmentDirectiveParser.h:48
> +    FragmentDirectiveParser(const URL&);

explicit?

> Source/WebCore/dom/FragmentDirectiveParser.h:50
> +    Vector<ParsedTextDirective> parsedTextDirectives() { return m_parsedTextDirectives; };

I don't think you mean to copy the vector here? Should probably be const and return a `const Vector<>&`

> Source/WebCore/dom/FragmentDirectiveParser.h:51
> +    String fragmentDirective() { return m_fragmentDirective; };

ditto.

> Source/WebCore/dom/FragmentDirectiveParser.h:53
> +    bool isValid() { return  m_isValid; };

should be const.

> Source/WebCore/page/FrameView.cpp:2218
> +        auto fragmentDirectiveParser = FragmentDirectiveParser(url);

FragmentDirectiveParser fragmentDirectiveParser(url); seems shorter.
Comment 12 Tim Horton 2021-10-11 15:30:32 PDT
Comment on attachment 440843 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=440843&action=review

>> Source/WebCore/dom/FragmentDirectiveParser.h:46
>> +    WTF_MAKE_FAST_ALLOCATED(FragmentDirectiveParser);
> 
> Not needed since you subclass RefCounted.

Oh, we discussed earlier not making this thing refcounted (see the public constructor and lack of ::create), but I think Megan missed removing the inheritance from RefCounted.
Comment 13 Chris Dumez 2021-10-11 15:32:01 PDT
Comment on attachment 440843 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=440843&action=review

>>> Source/WebCore/dom/FragmentDirectiveParser.h:46
>>> +    WTF_MAKE_FAST_ALLOCATED(FragmentDirectiveParser);
>> 
>> Not needed since you subclass RefCounted.
> 
> Oh, we discussed earlier not making this thing refcounted (see the public constructor and lack of ::create), but I think Megan missed removing the inheritance from RefCounted.

Good point. Having a public constructor AND subclassing RefCounted<> is definitely wrong :)
Comment 14 Megan Gardner 2021-10-11 15:33:09 PDT
Yes, I didn't remove the RefCounted bit, will fix shortly.
Comment 15 Megan Gardner 2021-10-11 20:03:59 PDT
Created attachment 440874 [details]
Patch
Comment 16 Megan Gardner 2021-10-12 11:26:34 PDT
Created attachment 440962 [details]
Patch
Comment 17 Tim Horton 2021-10-12 11:33:52 PDT
Comment on attachment 440874 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=440874&action=review

> Source/WebCore/ChangeLog:11
> +        * Sources.txt:

This whole section could use more words.

> Source/WebCore/dom/FragmentDirectiveParser.cpp:29
> +#import "Logging.h"

#include to fix the windows build

> Source/WebCore/dom/FragmentDirectiveParser.cpp:57
> +    // FIXME: this needs to be set on the original URL

Complete sentences with capitals and punctuations! Also I think it is a lie, you can't modify the loaded URL; the real thing you want to do is expose the de-directived fragment to other things in WebCore that use the fragment (e.g. normal scroll-to-fragment)... which it looks like you already do. Should talk to Alex.

> Source/WebCore/page/FrameView.cpp:2225
> +            

Shouldn't this `return` here?
Comment 18 Chris Dumez 2021-10-12 11:43:29 PDT
Comment on attachment 440962 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=440962&action=review

> Source/WebCore/dom/FragmentDirectiveParser.cpp:61
> +    parseFragmentDirective(fragmentDirectiveString);

Should pass fragmentDirective directly here. Going StringView -> String -> StringView is wrong.

> Source/WebCore/dom/FragmentDirectiveParser.h:43
> +    FragmentDirectiveParser(const URL&);

explicit

> Source/WebCore/dom/FragmentDirectiveParser.h:52
> +    bool parseFragmentDirective(StringView fragmentDirective);

parameter name can be omitted.
Comment 19 Wenson Hsieh 2021-10-12 12:03:05 PDT
Comment on attachment 440962 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=440962&action=review

Looks reasonable to me! Just a few minor comments.

> Source/WebCore/dom/FragmentDirectiveParser.cpp:41
> +    if (!fragmentIdentifier || fragmentIdentifier.isEmpty()) {

Nit - just `if (fragmentIdentifier.isEmpty())` is sufficient here, since the null string is considered empty.

> Source/WebCore/dom/FragmentDirectiveParser.cpp:42
> +        m_fragmentDirective = String();

Nit - I think you can just omit this line, since `m_fragmentDirective` is already initialized to the null string.

> Source/WebCore/dom/FragmentDirectiveParser.cpp:50
> +        m_fragmentDirective = String();

(Ditto)

> Source/WebCore/dom/FragmentDirectiveParser.cpp:58
> +    m_remainingURLFragment = fragmentIdentifier.substring(0, fragmentDirectiveStart);

Nit - `m_remainingURLFragment = fragmentIdentifier.left(fragmentDirectiveStart);` might be a tiny bit cleaner.

> Source/WebCore/page/FrameView.cpp:2213
> +    auto& document = *frame().document();

Nit - this should be a Ref or RefPtr.
Comment 20 Megan Gardner 2021-10-12 13:25:52 PDT
Created attachment 440975 [details]
Patch
Comment 21 Megan Gardner 2021-10-12 14:57:16 PDT
Created attachment 440989 [details]
Patch
Comment 22 Megan Gardner 2021-10-12 16:36:59 PDT
Created attachment 441016 [details]
Patch
Comment 23 Chris Dumez 2021-10-13 11:53:20 PDT
Comment on attachment 441016 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=441016&action=review

r=me with changes.

> Source/WebCore/dom/Document.h:2225
> +    StringView m_fragmentDirective;

It doesn't seem right of have a StringView data member on the Document. What guarantees that the underlying String stays alive?

We probably want to use String here and `const String&` in the getter/setter.

> Source/WebCore/dom/FragmentDirectiveParser.cpp:73
> +    fragmentDirective = fragmentDirective.substring(textDirective.length(), fragmentDirective.length());

Do you really need to specify the second parameter? Seems you want everything after the `text=` prefix.

> Source/WebCore/dom/FragmentDirectiveParser.h:29
> +#include <wtf/IsoMalloc.h>

Should not be needed.

> Source/WebCore/dom/FragmentDirectiveParser.h:41
> +    WTF_MAKE_FAST_ALLOCATED(FragmentDirectiveParser);

Should not be needed since you only stack-allocate it.

> Source/WebCore/dom/FragmentDirectiveParser.h:45
> +    const Vector<ParsedTextDirective>& parsedTextDirectives() { return m_parsedTextDirectives; };

getter should be marked as const.

> Source/WebCore/page/FrameView.cpp:2224
> +            // TODO: Scroll to the range specified by the directive

Per coding style (https://webkit.org/code-style-guidelines/#comments), this should be FIXME, not TODO.
Comment 24 Alex Christensen 2021-10-13 12:11:27 PDT
Comment on attachment 441016 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=441016&action=review

>> Source/WebCore/dom/Document.h:2225
>> +    StringView m_fragmentDirective;
> 
> It doesn't seem right of have a StringView data member on the Document. What guarantees that the underlying String stays alive?
> 
> We probably want to use String here and `const String&` in the getter/setter.

Definitely don't store a StringView on the Document.

> Source/WebCore/dom/FragmentDirectiveParser.cpp:81
> +        Vector<String> tokens;
> +        for (auto token : directive.split(','))
> +            tokens.append(token.toString());

Could this be a Vector<StringView>?
nit: I think WTF::map could be used here to be more concise.

> Source/WebCore/dom/FragmentDirectiveParser.cpp:98
> +            tokens.remove(0);

Removing the first element of a Vector isn't efficient.  Could tokens be a Deque instead?

> Source/WebCore/dom/FragmentDirectiveParser.cpp:107
> +        if (tokens.size() != 1 && tokens.size() != 2) {
> +            LOG_WITH_STREAM(TextFragment, stream << " not enough tokens ");

not enough or too many?
Comment 25 Megan Gardner 2021-10-13 13:42:14 PDT
Comment on attachment 441016 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=441016&action=review

>> Source/WebCore/dom/FragmentDirectiveParser.cpp:81
>> +            tokens.append(token.toString());
> 
> Could this be a Vector<StringView>?
> nit: I think WTF::map could be used here to be more concise.

I tried to do a vector of StringViews, but in the end, we want strings, and the change was easier to do here than later.
Comment 26 Megan Gardner 2021-10-13 15:04:32 PDT
Created attachment 441141 [details]
Patch
Comment 27 Chris Dumez 2021-10-13 15:10:40 PDT
Comment on attachment 441141 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=441141&action=review

> Source/WebCore/dom/Document.h:1635
> +    void setFragmentDirective(String& fragmentDirective) { m_fragmentDirective = fragmentDirective; }

const String&

> Source/WebCore/dom/FragmentDirectiveParser.cpp:92
> +                continue;

What's the point of this continue statement? Seems like a no-op.

Maybe we could do this logging in the `for (auto token : textDirective.split(','))` loop above to avoid iterating over the tokens again?
Comment 28 Megan Gardner 2021-10-13 15:36:03 PDT
Created attachment 441148 [details]
Patch
Comment 29 Chris Dumez 2021-10-13 15:40:17 PDT
Comment on attachment 441148 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=441148&action=review

> Source/WebCore/dom/FragmentDirectiveParser.cpp:82
> +        bool emptyToken = false;

The name is a bit unclear for a boolean. I think it would be nice to rename to containsEmptyToken or similar.

> Source/WebCore/dom/FragmentDirectiveParser.cpp:128
> +    

Extra blank line.
Comment 30 Chris Dumez 2021-10-13 15:40:41 PDT
Comment on attachment 441148 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=441148&action=review

> Source/WTF/ChangeLog:6
> +        Reviewed by NOBODY (OOPS!).

Feel free to put my name in the changelog since I already r+
Comment 31 Megan Gardner 2021-10-13 15:49:29 PDT
Created attachment 441153 [details]
Patch
Comment 32 Alex Christensen 2021-10-13 15:56:58 PDT
Comment on attachment 441153 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=441153&action=review

> Source/WebCore/dom/FragmentDirectiveParser.cpp:58
> +    parseFragmentDirective(fragmentDirective);

The return value of this isn't used.  Do you want to check if it returns false and early return, or make it return void?
Comment 33 Megan Gardner 2021-10-13 18:12:00 PDT
Created attachment 441170 [details]
Patch for landing
Comment 34 EWS 2021-10-13 19:12:39 PDT
Committed r284143 (242963@main): <https://commits.webkit.org/242963@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 441170 [details].
Comment 35 Radar WebKit Bug Importer 2021-10-13 19:13:17 PDT
<rdar://problem/84228621>
Comment 36 WebKit Commit Bot 2021-10-16 16:00:09 PDT
Re-opened since this is blocked by bug 231870
Comment 37 Tim Horton 2021-10-16 16:35:54 PDT
Reverted r284143 for reason:

Caused a series of assertions in layout tests in EWS

Committed r284326 (243121@main): <https://commits.webkit.org/243121@main>
Comment 38 Tim Horton 2021-10-16 16:43:27 PDT
Reverted the in https://trac.webkit.org/changeset/284326/webkit
I wasn't able to repro the issue from https://bugs.webkit.org/show_bug.cgi?id=231752 locally either, so I didn't manage to fix it, we'll just have to try again; hopefully someone else knows what the assertion is about.
Comment 39 Chris Dumez 2021-10-16 18:52:34 PDT
Comment on attachment 441170 [details]
Patch for landing

View in context: https://bugs.webkit.org/attachment.cgi?id=441170&action=review

> Source/WebCore/dom/FragmentDirectiveParser.cpp:60
> +    m_fragmentDirective = fragmentDirective.toString();

This is wrong. You are allocating a new temporary String and then creating a StringView to that temporary String, since m_fragmentDirective is a StringView. This is very likely what’s causing the crashes.
Comment 40 Chris Dumez 2021-10-16 18:52:35 PDT
Comment on attachment 441170 [details]
Patch for landing

View in context: https://bugs.webkit.org/attachment.cgi?id=441170&action=review

> Source/WebCore/dom/FragmentDirectiveParser.cpp:60
> +    m_fragmentDirective = fragmentDirective.toString();

This is wrong. You are allocating a new temporary String and then creating a StringView to that temporary String, since m_fragmentDirective is a StringView. This is very likely what’s causing the crashes.