Bug 190107

Summary: Use Position instead of Range in createMarkupInternal
Product: WebKit Reporter: Ryosuke Niwa <rniwa>
Component: HTML EditingAssignee: Ryosuke Niwa <rniwa>
Status: RESOLVED FIXED    
Severity: Normal CC: darin, ews-watchlist, koivisto, webkit-bug-importer, wenson_hsieh
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 190108    
Attachments:
Description Flags
Cleanup
none
Cleanup darin: review+

Description Ryosuke Niwa 2018-09-28 21:08:23 PDT
Rename createMarkupInternal to serializePreservingVisualAppearanceInternal to match the rename done in the bug 190086,
and use two Position's for start and end instead of Range in preparation to support copy & paste across shadow boundaries.
Comment 1 Ryosuke Niwa 2018-09-28 21:24:16 PDT
Created attachment 351168 [details]
Cleanup
Comment 2 Ryosuke Niwa 2018-09-28 21:25:28 PDT
Created attachment 351169 [details]
Cleanup
Comment 3 Radar WebKit Bug Importer 2018-09-28 21:26:07 PDT
<rdar://problem/44882459>
Comment 4 EWS Watchlist 2018-09-28 21:26:57 PDT
Attachment 351169 [details] did not pass style-queue:


ERROR: Source/WebCore/editing/markup.cpp:283:  Code inside a namespace should not be indented.  [whitespace/indent] [4]
ERROR: Source/WebCore/editing/markup.cpp:283:  Wrong number of spaces before statement. (expected: 8)  [whitespace/indent] [4]
ERROR: Source/WebCore/editing/markup.cpp:284:  Wrong number of spaces before statement. (expected: 8)  [whitespace/indent] [4]
ERROR: Source/WebCore/editing/markup.cpp:285:  Wrong number of spaces before statement. (expected: 8)  [whitespace/indent] [4]
Total errors found: 4 in 8 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 5 Ryosuke Niwa 2018-09-28 22:47:14 PDT
Hm... WinCairo EWS bot appears to be sick :( Will email webkit-dev.
Comment 6 Ryosuke Niwa 2018-09-28 23:37:11 PDT
Once this patch is landed, I'm fixing a bug that line breaks inserted for interchange ends up getting inserted inside the last table cell when copying & pasting a whole table, which is demonstrated in LayoutTests/editing/pasteboard/paste-table-003.html

See the bug 190108 for the fix dependent on this refactoring.
Comment 7 Darin Adler 2018-09-30 19:31:51 PDT
Comment on attachment 351169 [details]
Cleanup

View in context: https://bugs.webkit.org/attachment.cgi?id=351169&action=review

This change is definitely a good idea. Using Range internally in WebKit code is not a great pattern for multiple reasons; creating and destroying Range objects is quite costly because of they feature where ranges automatically update as a document is mutated.

On the other hand, maybe some day we will want to use a structure holding two Position objects for this kind of thing rather than two separate Position objects.

Functions that take two Position objects need to be prepared for them to be out of order. The same is not true for functions that take Range objects. I am concerned that we don’t get this exactly right, although I didn't spot any obvious problems here.

> Source/WebCore/dom/Position.cpp:253
> +    auto container = makeRefPtr(containerNode());

Seems strange to use a RefPtr here, but not for the result of computeNodeAfterPosition. Neither needs it.

> Source/WebCore/dom/Position.cpp:256
> +    if (is<CharacterData>(container))

Can be "*character"; might save a null check or might have no effect.

> Source/WebCore/dom/Position.h:106
> +    RefPtr<Node> firstNode() const;

In the old days this would return a Node*, not a RefPtr<Node>, but maybe it’s good design direction for newly written functions to use RefPtr to make the use of the function safer.

> Source/WebCore/editing/MarkupAccumulator.cpp:123
>      , m_prefixLevel(0)

Should initialize this in class definition.

> Source/WebCore/editing/markup.cpp:223
> +    StyledMarkupAccumulator(const Position& start, const Position& end, Vector<Node*>* nodes,
> +        ResolveURLs, AnnotateForInterchange, MSOListMode, bool needsPositionStyleConversion, Node* highestNodeToBeSerialized = nullptr);

I think that soon this class should be broken out into its own source file.

> Source/WebCore/editing/markup.cpp:396
> +    if (m_start.isNull() && m_end.isNull())
> +        return text.data();

This special case is not needed; the code below will do the same. I also don’t think it’s an important performance optimization.

> Source/WebCore/editing/markup.cpp:398
> +    String textData = text.data();

We don’t really need this local variable.

> Source/WebCore/editing/markup.cpp:400
> +    unsigned end = textData.length();

This could be std::numeric_limits<unsigned>::max() with no effect on the behavior of the function; substring clamps properly.

> Source/WebCore/editing/markup.cpp:405
> +    ASSERT(start < end);

This is a very late moment to check this invariant; the mistake that made it untrue will be *very* far from here. It’s possible that m_start and m_end will both point to the same text node, and the offsets will be out of order with the start one after the end one. I don’t know where the checks that guarantee that won’t happen are; I don’t think they are anywhere inside the StyledMarkupAccumulator class.

I think this is a fragile way to construct the code. One possible solution would be to make this function robust in the face of such unusual inputs, which would be very easy to do, just replace "end - start" with zero-clamping subtraction rather than standard subtraction that will underflow and result in a large number. Another solution would be to enforce the invariant in the constructor.

This is a new problem that did not exist when we were using Range, since the Range constructor does enforce this invariant.

> Source/WebCore/editing/markup.cpp:509
> +    Node* startNode = start.firstNode().get();

This line of code seems dangerous. Why is it OK to just call get on a RefPtr and then let it fall out of scope? Doing get() here seems inconsistent with the design direction that inspired us to make the firstNode function return a RefPtr.

> Source/WebCore/editing/markup.cpp:762
> +    if (start.isNull() || end.isNull() || !comparePositions(start, end))

The two null checks here aren’t needed. The comparePositions function already returns 0 if either of the two positions is null.

But also, it seems a bit peculiar and a bit expensive to call comparePositions here and it’s not clear what the check is intended to do. After all "!comparePositions(a, b)" is roughly the same as "a == b". The expensive operation is figuring out -1 vs. +1, not 0 vs. non-0. Is the check truly required?

> Source/WebCore/editing/markup.h:49
> +class VisibleSelection;

Why are we adding this in this patch? Maybe it’s part of the next patch?

> Source/WebCore/editing/markup.h:74
> -WEBCORE_EXPORT String serializePreservingVisualAppearance(const Range&, Vector<Node*>* = nullptr, AnnotateForInterchange = AnnotateForInterchange::No, ConvertBlocksToInlines = ConvertBlocksToInlines::No, ResolveURLs = ResolveURLs::No);
> +WEBCORE_EXPORT String serializePreservingVisualAppearance(const Range&, Vector<Node*>* = nullptr,
> +    AnnotateForInterchange = AnnotateForInterchange::No, ConvertBlocksToInlines = ConvertBlocksToInlines::No, ResolveURLs = ResolveURLs::No);

This change just seems to be adding a line break. Can we omit it? Maybe it makes more sense in the follow-up patch.
Comment 8 Ryosuke Niwa 2018-09-30 22:03:52 PDT
(In reply to Darin Adler from comment #7)
> Comment on attachment 351169 [details]
> Cleanup
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=351169&action=review
> 
> This change is definitely a good idea. Using Range internally in WebKit code
> is not a great pattern for multiple reasons; creating and destroying Range
> objects is quite costly because of they feature where ranges automatically
> update as a document is mutated.
> 
> On the other hand, maybe some day we will want to use a structure holding
> two Position objects for this kind of thing rather than two separate
> Position objects.

Yes!

We already have VisibleSelection but that one really corresponds to VisibleSelection
so I think we want some kind of PositionRange / PositionPair / etc...

In fact, if I remember correctly, some parts of accessibility code
uses some struct which stores information equivalent to Position.

> Functions that take two Position objects need to be prepared for them to be
> out of order. The same is not true for functions that take Range objects. I
> am concerned that we don’t get this exactly right, although I didn't spot
> any obvious problems here.

Yeah, that is definitely a concern if this pattern starts getting used in more places.
Comment 9 Ryosuke Niwa 2018-09-30 22:23:56 PDT
Comment on attachment 351169 [details]
Cleanup

View in context: https://bugs.webkit.org/attachment.cgi?id=351169&action=review

Thanks for the review!

>> Source/WebCore/dom/Position.cpp:253
>> +    auto container = makeRefPtr(containerNode());
> 
> Seems strange to use a RefPtr here, but not for the result of computeNodeAfterPosition. Neither needs it.

Yeah, I'm trying to avoid using raw pointers.

>> Source/WebCore/dom/Position.cpp:256
>> +    if (is<CharacterData>(container))
> 
> Can be "*character"; might save a null check or might have no effect.

Will fix.

>> Source/WebCore/dom/Position.h:106
>> +    RefPtr<Node> firstNode() const;
> 
> In the old days this would return a Node*, not a RefPtr<Node>, but maybe it’s good design direction for newly written functions to use RefPtr to make the use of the function safer.

I think the goal of no-raw pointer is to avoid having to think about whether this function is safe to use a raw pointer, etc...
I tend to agree that this function maybe simple enough that it doesn't warrant a RefPtr
but I think that kind of reasoning can be confusing and may lead to bugs.
We really need to figure out & solidify what the "correct" rule for when & when not to use raw pointers.
For now, I'm gonna stick with RefPtr.

>> Source/WebCore/editing/MarkupAccumulator.cpp:123
>>      , m_prefixLevel(0)
> 
> Should initialize this in class definition.

Will fix.

>> Source/WebCore/editing/markup.cpp:223
>> +        ResolveURLs, AnnotateForInterchange, MSOListMode, bool needsPositionStyleConversion, Node* highestNodeToBeSerialized = nullptr);
> 
> I think that soon this class should be broken out into its own source file.

Yes. In fact, this class probably shouldn't even inherit from MarkupAccumulator.
This class traverses nodes completely differently from MarkupAccumulator, etc...
I think a better design would be for MarkupAccumulator to expose some helper functions, which StyledMarkupAccumulator uses.

>> Source/WebCore/editing/markup.cpp:396
>> +        return text.data();
> 
> This special case is not needed; the code below will do the same. I also don’t think it’s an important performance optimization.

That's good point. Will remove.

>> Source/WebCore/editing/markup.cpp:398
>> +    String textData = text.data();
> 
> We don’t really need this local variable.

Sure, will remove.

>> Source/WebCore/editing/markup.cpp:400
>> +    unsigned end = textData.length();
> 
> This could be std::numeric_limits<unsigned>::max() with no effect on the behavior of the function; substring clamps properly.

That's a good point. Will do.

>> Source/WebCore/editing/markup.cpp:405
>> +    ASSERT(start < end);
> 
> This is a very late moment to check this invariant; the mistake that made it untrue will be *very* far from here. It’s possible that m_start and m_end will both point to the same text node, and the offsets will be out of order with the start one after the end one. I don’t know where the checks that guarantee that won’t happen are; I don’t think they are anywhere inside the StyledMarkupAccumulator class.
> 
> I think this is a fragile way to construct the code. One possible solution would be to make this function robust in the face of such unusual inputs, which would be very easy to do, just replace "end - start" with zero-clamping subtraction rather than standard subtraction that will underflow and result in a large number. Another solution would be to enforce the invariant in the constructor.
> 
> This is a new problem that did not exist when we were using Range, since the Range constructor does enforce this invariant.

Okay, that's a good point. Let me add ASSERT(comparePositions(start, end) <= 0) in serializeNodes.
I was going to do that but I guess I forgot to actually do it.

>> Source/WebCore/editing/markup.cpp:509
>> +    Node* startNode = start.firstNode().get();
> 
> This line of code seems dangerous. Why is it OK to just call get on a RefPtr and then let it fall out of scope? Doing get() here seems inconsistent with the design direction that inspired us to make the firstNode function return a RefPtr.

Oh oops, that's just a mistake. Will fix.

>> Source/WebCore/editing/markup.cpp:762
>> +    if (start.isNull() || end.isNull() || !comparePositions(start, end))
> 
> The two null checks here aren’t needed. The comparePositions function already returns 0 if either of the two positions is null.
> 
> But also, it seems a bit peculiar and a bit expensive to call comparePositions here and it’s not clear what the check is intended to do. After all "!comparePositions(a, b)" is roughly the same as "a == b". The expensive operation is figuring out -1 vs. +1, not 0 vs. non-0. Is the check truly required?

I wanted to guard against the code which invoked this function with end pointing a position before start.
Note that comparePositions doesn't traverse through all nodes, etc...
It would find the first ancestor node which contains either position and finds which one comes first.
Anyhow, I don't think this will be the expensive part of this function.

>> Source/WebCore/editing/markup.h:49
>> +class VisibleSelection;
> 
> Why are we adding this in this patch? Maybe it’s part of the next patch?

Oh oops, yes. Will revert this from this patch.

>> Source/WebCore/editing/markup.h:74
>> +    AnnotateForInterchange = AnnotateForInterchange::No, ConvertBlocksToInlines = ConvertBlocksToInlines::No, ResolveURLs = ResolveURLs::No);
> 
> This change just seems to be adding a line break. Can we omit it? Maybe it makes more sense in the follow-up patch.

Hm... this line looked a bit too long to but okay. Will revert.
Comment 10 Ryosuke Niwa 2018-09-30 22:30:04 PDT
Committed r236649: <https://trac.webkit.org/changeset/236649>