WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
Bug 213318
Text manipulation does not observe manipulated text after update
https://bugs.webkit.org/show_bug.cgi?id=213318
Summary
Text manipulation does not observe manipulated text after update
Sihui Liu
Reported
2020-06-17 15:13:30 PDT
...
Attachments
Patch
(9.93 KB, patch)
2020-06-17 15:18 PDT
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(9.97 KB, patch)
2020-06-17 15:20 PDT
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(9.85 KB, patch)
2020-06-26 12:46 PDT
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(12.49 KB, patch)
2020-06-26 15:21 PDT
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(12.02 KB, patch)
2020-06-26 16:25 PDT
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch
(12.74 KB, patch)
2020-06-29 09:58 PDT
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Patch for landing
(12.72 KB, patch)
2020-07-13 10:01 PDT
,
Sihui Liu
no flags
Details
Formatted Diff
Diff
Show Obsolete
(6)
View All
Add attachment
proposed patch, testcase, etc.
Sihui Liu
Comment 1
2020-06-17 15:14:42 PDT
<
rdar://problem/63766703
>
Sihui Liu
Comment 2
2020-06-17 15:18:39 PDT
Created
attachment 402159
[details]
Patch
Sihui Liu
Comment 3
2020-06-17 15:20:36 PDT
Created
attachment 402160
[details]
Patch
Sihui Liu
Comment 4
2020-06-26 12:46:54 PDT
Created
attachment 402887
[details]
Patch
Darin Adler
Comment 5
2020-06-26 12:49:32 PDT
Comment on
attachment 402887
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=402887&action=review
> Source/WebCore/dom/CharacterData.cpp:70 > + if (is<Text>(*this) && !m_data.contains(oldData)) {
This contains check seems peculiar. The new data might happen to contain the old data by coincidence; not sure if it’s a meaningful check. Also, the is<Text> check seems peculiar. Normally this would be done by making the setData function virtual rather than adding a derived class check.
Wenson Hsieh
Comment 6
2020-06-26 13:50:41 PDT
Comment on
attachment 402887
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=402887&action=review
>> Source/WebCore/dom/CharacterData.cpp:70 >> + if (is<Text>(*this) && !m_data.contains(oldData)) { > > This contains check seems peculiar. The new data might happen to contain the old data by coincidence; not sure if it’s a meaningful check. > > Also, the is<Text> check seems peculiar. Normally this would be done by making the setData function virtual rather than adding a derived class check.
I believe `!m_data.contains(oldData)` is just a translation-specific heuristic in this case, since the translation framework does not handle text that has already been partially translated to the target language. For instance, if we have a text node like "one two" that is translated to "一 二", and then the page's script later sets this text to "three four", we'd want to re-mark it for translation so that it can be translated to "三 四". However, if the web page appends new text in the source language — for example, "一 二 three four" — it is better to avoid re-translating this text. That said, it does sound expensive to check for `contains` every time here; maybe we should just omit the `contains` check, or only do it if the text manipulation controller exists?
Darin Adler
Comment 7
2020-06-26 13:51:30 PDT
Comment on
attachment 402887
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=402887&action=review
>>> Source/WebCore/dom/CharacterData.cpp:70 >>> + if (is<Text>(*this) && !m_data.contains(oldData)) { >> >> This contains check seems peculiar. The new data might happen to contain the old data by coincidence; not sure if it’s a meaningful check. >> >> Also, the is<Text> check seems peculiar. Normally this would be done by making the setData function virtual rather than adding a derived class check. > > I believe `!m_data.contains(oldData)` is just a translation-specific heuristic in this case, since the translation framework does not handle text that has already been partially translated to the target language. For instance, if we have a text node like "one two" that is translated to "一 二", and then the page's script later sets this text to "three four", we'd want to re-mark it for translation so that it can be translated to "三 四". However, if the web page appends new text in the source language — for example, "一 二 three four" — it is better to avoid re-translating this text. > > That said, it does sound expensive to check for `contains` every time here; maybe we should just omit the `contains` check, or only do it if the text manipulation controller exists?
Can contains be a coincidence?
Wenson Hsieh
Comment 8
2020-06-26 13:59:09 PDT
Comment on
attachment 402887
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=402887&action=review
>>>> Source/WebCore/dom/CharacterData.cpp:70 >>>> + if (is<Text>(*this) && !m_data.contains(oldData)) { >>> >>> This contains check seems peculiar. The new data might happen to contain the old data by coincidence; not sure if it’s a meaningful check. >>> >>> Also, the is<Text> check seems peculiar. Normally this would be done by making the setData function virtual rather than adding a derived class check. >> >> I believe `!m_data.contains(oldData)` is just a translation-specific heuristic in this case, since the translation framework does not handle text that has already been partially translated to the target language. For instance, if we have a text node like "one two" that is translated to "一 二", and then the page's script later sets this text to "three four", we'd want to re-mark it for translation so that it can be translated to "三 四". However, if the web page appends new text in the source language — for example, "一 二 three four" — it is better to avoid re-translating this text. >> >> That said, it does sound expensive to check for `contains` every time here; maybe we should just omit the `contains` check, or only do it if the text manipulation controller exists? > > Can contains be a coincidence?
Indeed — it could very well be a coincidence. For instance, maybe the original text was just a single space or symbol... My understanding is that this is just a heuristic that tends to work well in practice. @Sihui — were there any websites where this `contains` check is necessary for compat? If not, perhaps we could simplify this a bit by calling didUpdateContentForText if the text changed at all.
Sihui Liu
Comment 9
2020-06-26 14:08:18 PDT
(In reply to Wenson Hsieh from
comment #8
)
> Comment on
attachment 402887
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=402887&action=review
> > >>>> Source/WebCore/dom/CharacterData.cpp:70 > >>>> + if (is<Text>(*this) && !m_data.contains(oldData)) { > >>> > >>> This contains check seems peculiar. The new data might happen to contain the old data by coincidence; not sure if it’s a meaningful check. > >>> > >>> Also, the is<Text> check seems peculiar. Normally this would be done by making the setData function virtual rather than adding a derived class check. > >> > >> I believe `!m_data.contains(oldData)` is just a translation-specific heuristic in this case, since the translation framework does not handle text that has already been partially translated to the target language. For instance, if we have a text node like "one two" that is translated to "一 二", and then the page's script later sets this text to "three four", we'd want to re-mark it for translation so that it can be translated to "三 四". However, if the web page appends new text in the source language — for example, "一 二 three four" — it is better to avoid re-translating this text. > >> > >> That said, it does sound expensive to check for `contains` every time here; maybe we should just omit the `contains` check, or only do it if the text manipulation controller exists? > > > > Can contains be a coincidence? > > Indeed — it could very well be a coincidence. For instance, maybe the > original text was just a single space or symbol... > > My understanding is that this is just a heuristic that tends to work well in > practice. @Sihui — were there any websites where this `contains` check is > necessary for compat? If not, perhaps we could simplify this a bit by > calling didUpdateContentForText if the text changed at all.
This check is not necessary for the radar. I will remove it.
Sihui Liu
Comment 10
2020-06-26 14:52:12 PDT
(In reply to Darin Adler from
comment #5
)
> Comment on
attachment 402887
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=402887&action=review
> > > Source/WebCore/dom/CharacterData.cpp:70 > > + if (is<Text>(*this) && !m_data.contains(oldData)) { > > This contains check seems peculiar. The new data might happen to contain the > old data by coincidence; not sure if it’s a meaningful check. > > Also, the is<Text> check seems peculiar. Normally this would be done by > making the setData function virtual rather than adding a derived class check.
I found setData is called in multiple places: CharacterData::setNodeValue, replaceChildrenWithFragment, Text::replaceWholeText, etc, so I just added a general check. I can add didUpdateContentForText to each of them if it's preferred.
Sihui Liu
Comment 11
2020-06-26 15:21:46 PDT
Created
attachment 402912
[details]
Patch
Darin Adler
Comment 12
2020-06-26 15:38:37 PDT
Comment on
attachment 402912
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=402912&action=review
> Source/WebCore/dom/CharacterData.cpp:190 > + if (is<Text>(*this)) {
Can we make the setNodeValue function be virtual and overridden in the Text class, instead of doing if here? That’s the pattern for C++ programming. Or is that poorer for performance? And/or why isn’t this in the setData function? Are there calls to setData that must not do this?
Sihui Liu
Comment 13
2020-06-26 16:25:14 PDT
(In reply to Darin Adler from
comment #12
)
> Comment on
attachment 402912
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=402912&action=review
> > > Source/WebCore/dom/CharacterData.cpp:190 > > + if (is<Text>(*this)) { > > Can we make the setNodeValue function be virtual and overridden in the Text > class, instead of doing if here? That’s the pattern for C++ programming. Or > is that poorer for performance? >
Sure, will update the patch by overriding setData in Text.
> And/or why isn’t this in the setData function? Are there calls to setData > that must not do this?
No, I just misunderstood your previous comment.
Sihui Liu
Comment 14
2020-06-26 16:25:54 PDT
Created
attachment 402922
[details]
Patch
Darin Adler
Comment 15
2020-06-28 12:50:48 PDT
Comment on
attachment 402922
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=402922&action=review
> Source/WebCore/dom/CharacterData.h:35 > + WEBCORE_EXPORT virtual void setData(const String&);
Looking even more closely, it seems like the real bottleneck is CharacterData::setDataAndUpdate, called in various cases where setData is not called. So maybe it would be even better to put the new code there.
Sihui Liu
Comment 16
2020-06-29 09:57:08 PDT
(In reply to Darin Adler from
comment #15
)
> Comment on
attachment 402922
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=402922&action=review
> > > Source/WebCore/dom/CharacterData.h:35 > > + WEBCORE_EXPORT virtual void setData(const String&); > > Looking even more closely, it seems like the real bottleneck is > CharacterData::setDataAndUpdate, called in various cases where setData is > not called. > > So maybe it would be even better to put the new code there.
Sure, that should work too.
Sihui Liu
Comment 17
2020-06-29 09:58:52 PDT
Created
attachment 403077
[details]
Patch
Geoffrey Garen
Comment 18
2020-07-10 15:52:00 PDT
Comment on
attachment 403077
[details]
Patch r=me Looks like all comments have been addressed and incorporated.
Sihui Liu
Comment 19
2020-07-13 10:01:48 PDT
Created
attachment 404153
[details]
Patch for landing
EWS
Comment 20
2020-07-13 10:36:26 PDT
Committed
r264305
: <
https://trac.webkit.org/changeset/264305
> All reviewed patches have been landed. Closing bug and clearing flags on
attachment 404153
[details]
.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug