55898 – REGRESSION (HTML5 tree builder): Text selection in a large text document is extremely slow

RESOLVED FIXED 55898

REGRESSION (HTML5 tree builder): Text selection in a large text document is extremely slow

https://bugs.webkit.org/show_bug.cgi?id=55898

Summary REGRESSION (HTML5 tree builder): Text selection in a large text document is e...

Alexey Proskuryakov

Reported 2011-03-07 13:08:45 PST

See bug URL. Selecting near the end of document is extremely slow. This regressed because a fix for bug 12833 has been reverted in order to match HTML5. Current releases of Safari and Firefox limit the size of parser inserted text nodes. This has been forbidden in HTML5 and dropped in implementations because client code can break when trying to match text that crosses an engine node boundary, and we shouldn't surprise JS developers like that. I'm not sure if this was a good tradeoff: 1) Most client code needs to deal with text crossing node boundaries anyway (e.g. for style changes, or for line breaks, or for DOM-inserted nodes). 2) We shouldn't surprise client code developers with huge amounts of text either. Just like we bend backwards to spare them from threading or reentrancy issues, we can limit the harm from non-linear algorithms here. 3) The failure mode for an O(N^2) algorithm on an unlimited length text (freezing) is often worse than missing a match in a huge text due to hitting a boundary. This has also been affecting clients of Objective C WebKit API (<rdar://problem/9013049> for those who can see it). There isn't much difference from JavaScript case, with one twist: when using Objective C autorelease pools, a client can easily take a huge hit in memory consumption, in addition to CPU performance. Regardless of what we do long term, I think that we need to re-introduce the node size limit in WebKit until we can sort out internal performance, and get clients updated.

Attachments
proposed fix (21.39 KB, patch) 2011-03-07 16:45 PST, Alexey Proskuryakov	no flags	Details Formatted Diff Diff
View All Add attachment proposed patch, testcase, etc.

Adam Barth

Comment 1 2011-03-07 13:13:38 PST

I gave ap some pointers on IRC to where in the code needs to be changed. Eric seems to think he did this already, but I can't find any evidence of that.

Boris Zbarsky

Comment 2 2011-03-07 13:18:48 PST

For what it's worth, the most common compat issue we had with the old behavior in Gecko was sites using XHR and then extracting the text from a node in the responseXML. They _could_ use .textContent, but tended to use .firstChild.data, and this would break in random ways depending on the data... We did have to fix a few of the resulting performance issues in Gecko 2, yes. Wasn't that big a deal.

Eric Seidel (no email)

Comment 3 2011-03-07 15:51:11 PST

I'll dig around. I swear I implemented text node splitting when doing the new parser.

Alexey Proskuryakov

Comment 4 2011-03-07 16:34:10 PST

I have a patch that I'm testing now.

Alexey Proskuryakov

Comment 5 2011-03-07 16:45:38 PST

Created attachment 84996 [details] proposed fix Even if we improve text selection performance, add a quirk for Mail, and decide to not split the nodes, this code will still be needed for the quirk.

Darin Adler

Comment 6 2011-03-07 18:19:10 PST

Comment on attachment 84996 [details] proposed fix Looks good.

Eric Seidel (no email)

Comment 7 2011-03-07 18:56:06 PST

Comment on attachment 84996 [details] proposed fix OK.

WebKit Commit Bot

Comment 8 2011-03-07 19:18:05 PST

Comment on attachment 84996 [details] proposed fix Clearing flags on attachment: 84996 Committed r80526: <http://trac.webkit.org/changeset/80526>

WebKit Commit Bot

Comment 9 2011-03-07 19:18:11 PST

All reviewed patches have been landed. Closing bug.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware All

OS All

Product WebKit

Component DOM

Assignee

Alexey Proskuryakov

Reported

2011-03-07 13:08 PST

Modified

2011-03-07 19:18 PST History

CC List

9 users Show

URL

http://dscoder.com/test.txt

Keywords

Depends on

Blocks

41115

Dependencies

tree graph