Bug 5610

Summary:

-[WebCoreBridge convertNSRangeToDOMRange] often returns wrong or "unusable" results

Product:

WebKit

Reporter:

Evan Gross <evan>

Component:

HTML Editing

Assignee:

Alexey Proskuryakov <ap>

Status:

NEW

Severity:

Normal

CC:

ap, bburg, ian, justin.garcia, rniwa

Priority:

Version:

420+

Hardware:

Mac

OS:

OS X 10.4

Attachments:

Description	Flags
Modified Blot app to illustrate the results of convertNSRangeToDOMRange and what goes wrong when it fails	none
some automated test cases	none
proposed patch	justin.garcia: review-

Evan Gross

Reported 2005-11-03 02:07:25 PST

It's looking like one of the main reasons that bug 4680, bug 5401, bug 5445, bug 4682 and perhaps other clients (really not many besides these NSTextInput methods. There may well be other places that other AppKit APIs that have only an NSRange at their disposal that need to use convertNSRangeToDOMRange. Spelling Panel (perhaps), AX APIs, AppleScripting perhaps? Currently, after a bit of range checking, TextIterator::rangeFromLocationAndLength() is used to figure out the resulting DOM range. We've tried to fix it up a bit in Bug 4682 (attachment (id=4424)), but think that this isn't the right approach. What this method (function it calls, whatever) REALLY needs to do is return the EXACT DOMRange you'd get from -[WebCoreBridge selectedDOMRange] just as though the particular NSRange you want to convert were actually the document's selected range. The above appears to be necessary to get predictable results from a number of methods: KWQKHTMLPart::attributedString() - if the values you supply for its arguments aren't quite right, you get, well, incorrect results. It appears that it may be built on the assumption that they've come from some sort of "selected range". Not totally sure, as this is just a monster function that I have no time/ intention of really acquiring a deep knowledge of. -[WebCoreBridge stringForRange:] is given a DOMRange that may have been converted from an NSRange. Seems to be the same story as above. -[WebCoreBridge selectNSRange:] also converts an NSRange to a DOMRange. No idea if there are issues with setting the selection to the current DOMRange returned from convertNSRangeToDOMRange. I think there is only one, rarely used client of this particular method. SO, TO FIX THIS - we probably need to use VisiblePosition, SelectionController, and (unfortunately due to current performance) CharacterIterator to create an exact replica of the range you'd get if the given selection consisted of the given (arbitrary, as it's NOT actually the selected range in the document!) NSRange. There are places where something like this is done, but it's not quite the same. The code that deals with finding/marking spellings is one, there are a couple of text search/find methods (KWQKHTMLPart.mm), perhaps somewhere in the accessibility code. I've tried to come up with something, but just don't know my way around this stuff well enough to get it right. Again, a plea for help... It's simple to test this stuff just with console logging. It's hard with layout tests because there are so many possible variations of text and ranges of text with it. What I have been doing is simply playing with making selections in Blot with this code pasted into -[WebHTMLView _selectionChanged], then analyzing the output. Selections enclosing line endings, at/before line breaks, and many others will give incorrect results. Easy to see - make the selection in Blot, take a look at what was spewed to the Console. Copy/Paste this in: - (void)_selectionChanged { [self _updateSelectionForInputManager]; [self _updateFontPanel]; _private->startNewKillRingSequence = YES; #if 1 WebBridge *bridge = [self _bridge]; DOMRange *selectedDOMRange = [bridge selectedDOMRange]; DOMRange *range = [bridge convertNSRangeToDOMRange:[bridge selectedNSRange]]; NSLog(@"++++_selectionChanged - selectedNSRange:%@\n1. selectedDOMRange:\n%@ and backtoNS:%@\n(stringForRange:%@ len:%d)\n2. convertNSRangeToDOMRange:\n%@ and backtoNS:%@ \n(stringForRange:%@ len:%d)\n", NSStringFromRange([bridge selectedNSRange]), selectedDOMRange, NSStringFromRange([bridge convertDOMRangeToNSRange:selectedDOMRange]), [bridge stringForRange:selectedDOMRange], [[bridge stringForRange:selectedDOMRange] length], range, NSStringFromRange([bridge convertDOMRangeToNSRange:range]), [bridge stringForRange:range], [[bridge stringForRange:range] length]); NSAttributedString *attributedSubString = [bridge attributedStringFrom:[selectedDOMRange startContainer] startOffset:[selectedDOMRange startOffset] to:[selectedDOMRange endContainer] endOffset:[selectedDOMRange endOffset]]; NSAttributedString *attributedSubString2 = [bridge attributedStringFrom:[range startContainer] startOffset:[range startOffset] to:[range endContainer] endOffset:[range endOffset]]; NSLog(@"attributedSubString 1:%@ len:%d\nattributedSubString 2:%@len:%d", attributedSubString, [attributedSubString length], attributedSubString2, [attributedSubString2 length]); if ([range startContainer] != [selectedDOMRange startContainer] || [range startOffset] != [selectedDOMRange startOffset] || [range endContainer] != [selectedDOMRange endContainer] || [range endOffset] != [selectedDOMRange endOffset]) NSLog(@"ranges DIFFERENT!"); if (![attributedSubString isEqualToAttributedString:attributedSubString2]) NSLog(@"attributed strings DIFFERENT!"); #endif } Convince yourself (if needed) that something's very wrong. Re-do the logging however you see fit.

Attachments
Modified Blot app to illustrate the results of convertNSRangeToDOMRange and what goes wrong when it fails (124.13 KB, application/octet-stream) 2005-12-23 23:17 PST, Evan Gross	no flags	Details
some automated test cases (4.40 KB, application/zip) 2006-01-02 00:42 PST, Alexey Proskuryakov	no flags	Details
proposed patch (14.99 KB, patch) 2006-01-03 14:35 PST, Alexey Proskuryakov	justin.garcia: review-	Details Formatted Diff Diff
View All Add attachment proposed patch, testcase, etc.

Evan Gross

Comment 1 2005-11-03 02:09:40 PST

Whoops, the first sentence should begin: It's looking like *this bug is* one of the main reasons that bug 4680, bug 5401, bug 5445, bug 4682 and perhaps other clients...

Evan Gross

Comment 2 2005-11-03 02:16:40 PST

I've taken a crack, but realize I really have no clue how to properly use some of these classes, let alone what to do in order to make sure the result is a range that would *exactly* match the one returned if the given range of characters constituted the selection. It doesn't work properly at the moment in a few cases, like carets at the end of a word before a line break. Maybe someone that knows this stuff can spot (how much is?) what's wrong. Or perhaps a better approach! - (SharedPtr<DOM::RangeImpl>)convertToDOMRange:(NSRange)nsrange { SharedPtr<RangeImpl> result(_part->xmlDocImpl()->createRange()); CharacterIterator it(rangeOfContents(_part->xmlDocImpl()).get()); int exception = 0; if (!it.atEnd()) it.advance(nsrange.location); result->setStart(it.range()->startContainer(exception), it.range()->startOffset(exception), exception); it.string(nsrange.length); result->setEnd(it.range()->startContainer(exception), it.range()->startOffset(exception), exception); return SelectionController(result.get(), DOWNSTREAM, VP_UPSTREAM_IF_POSSIBLE).toRange(); } - (DOMRange *)convertNSRangeToDOMRange:(NSRange)nsrange { if (nsrange.location > INT_MAX) return 0; if (nsrange.length > INT_MAX || nsrange.location + nsrange.length > INT_MAX) nsrange.length = INT_MAX - nsrange.location; SharedPtr<RangeImpl> result([self convertToDOMRange:nsrange]); return [DOMRange _rangeWithImpl:result.get()]; } (I think the current code leaks a RangeImpl - am I correct? And another one in smartDeleteRangeForProposedRange (RangeImpl * created near the end?)

Evan Gross

Comment 3 2005-12-23 23:17:41 PST

Created attachment 5256 [details] Modified Blot app to illustrate the results of convertNSRangeToDOMRange and what goes wrong when it fails Run this version of Blot with TOT. Open the test document in the archive. Triple-click the first line just to get started. View in the inspector window (on-the-fly as you change the selection) the results of convertNSRangeToDOMRange (and its inverse), and the attributed string obtained three different ways. When the DOMRange's don't match up, they are drawn in red. When the attributed strings don't match up with the results of [bridge selectedAttributedString], the background is drawn in yellow. Just play for 30 seconds, you'll see there are real problems here...

Alexey Proskuryakov

Comment 4 2005-12-24 02:59:26 PST

At the moment, I see two separate problems with range conversion: 1) DIV-based line feeds cause a lot of problems. 2) DOM ranges for empty selections are different (for an empty document, words with style changes etc). Anything I'm missing? Worst of all, I still don't know what is correct conceptually (whether DOM or render trees should be used for such conversions).

Evan Gross

Comment 5 2005-12-24 04:25:37 PST

(In reply to comment #4) > Worst of all, I still don't know what is correct conceptually (whether DOM or render trees should be used > for such conversions). It seems to me that all the machinery that deals with editing/text/strings and DOMRange's expect as input (or simply work best with) the selected range. I see no wrong results from selectedAttributedString... Might be time to take a different approach as I mentioned before. We need a DOMRange that exactly matches the one you'd get if you positioned the insertion point at range.location and then extended the "selection" by range.length characters. We need the selected DOMRange, as though the user placed the insertion point and expanded it by range.length characters - BUT without actually selecting anything in the document. I tried, made some progress, then just gave up because I just didn't really "get" the right way to do it. Thoughts? Any closer to confirming this?

Evan Gross

Comment 6 2005-12-24 04:28:01 PST

Sorry, baked line wrapping from my Treo 650...

Alexey Proskuryakov

Comment 7 2005-12-26 14:10:51 PST

I think I finally know what to fix here :) TextIterator often appends text generated from non-text nodes to other nodes that it finds suitable. This is OK for what it does internally, but it 1) breaks KWQKHTMLPart::attributedString in a lot of cases; 2) makes writing a correct attributedString accessor based on TextIterator machinery hard, to say the least. I'm not quite sure if Selection/VisiblePosition code can or should have anything to do with this. Comments are welcome.

Alexey Proskuryakov

Comment 8 2006-01-02 00:42:31 PST

Created attachment 5416 [details] some automated test cases

Alexey Proskuryakov

Comment 9 2006-01-03 14:35:50 PST

Created attachment 5459 [details] proposed patch The issue here is that TextIterator ranges aren't expected to work with TextInput at all - plain and attributed strings can rightfully have different lengths (for example, this happens for strings containing inline images). So, even merging TextIterator with VisiblePosition is not a solution here (although it still may be a possible future direction for other reasons). To really make TextInput work, TextIterator should probably get a mode for generating attributed strings (and thus supersede KWQKHTMLPart::attributedString()). However, there are some issues with TextIterator::rangeFromLocationAndLength() which are common for plain and attributed strings, so fixing these should positively affect both.

Darin Adler

Comment 10 2006-01-04 13:06:30 PST

Comment on attachment 5459 [details] proposed patch Looks like there's no call to setEnd at all in the case where startOffset == endOffset. Is that OK?

Alexey Proskuryakov

Comment 11 2006-01-04 15:21:42 PST

(In reply to comment #10) In this case, setEnd should have been called on the previous iteration (when rangeEnd was equal to docTextPosition+len).

Justin Garcia

Comment 12 2006-01-04 18:29:41 PST

I have a question about your changes to ::handleNonTextNode(). The ranges returned for emitted characters seem wrong to me with or without your patch. Here's an example: <div id="parent"> 123 <div id="child">456</div> 789 </div> This is rendered as: 123 456 789 Before your changes, TextIterator returns the following ranges and lengths when iterating over the above snippit: textnode "123" offset 0 to 3 (length 3) div id="parent" offset 0 to 1 (length 1) textnode "456" offset 0 to 3 (length 3) div id="parent" offset 2 to 2 (length 1) textnode "789" offset 0 to 3 (length 3) After your changes: textnode "123" offset 0 to 3 (length 3) div id="child" offset 0 to 0 (length 1) textnode "456" offset 0 to 3 (length 3) div id="parent" offset 2 to 2 (length 1) textnode "789" offset 0 to 3 (length 3) Of course, the range associated with the second emitted '\n' didn't change because you didn't touch exitNode(). I think that the ranges associated with the first emitted '\n are wrong. I think the range should be: start: (textnode "123" offset 3) end: (textnode "456" offset 0) This would also be acceptable: start: (div id="parent" offset 1) end: (div id="child" offset 0) Find on Page is implemented with TextIterators. When someone searches for a \n, the range returned by the TextIterator should be able to be used to select something appropriate for a \n. Why do you want the ranges for emitted characters to be collapsed ranges?

Alexey Proskuryakov

Comment 13 2006-01-05 01:59:19 PST

(In reply to comment #12) This behavior is due to a difference between TextIterator::range() and the range returned from rangeFromLocationAndLength(). An iterator always has the same node as it start/end, and it's the job of rangeFromLocationAndLength() to make a usable range out of a sequence of single-node iterator positions. In your example, this indeed happens correctly with the patch applied, and the range returned for the first \n is start: (textnode "123" offset 3) end: (textnode "456" offset 0) (before testing, I have removed all the newlines from the source, because they were skewing the results). CharacterIterator and findPlainText() should also honor collapsed ranges in a similar way. At the moment, searching for strings with line breaks is unimplemented (there's an explicit check and a FIXME).

Justin Garcia

Comment 14 2006-01-05 11:04:19 PST

(In reply to comment #13) > (In reply to comment #12) > This behavior is due to a difference between TextIterator::range() and the range returned from > rangeFromLocationAndLength(). An iterator always has the same node as it start/end, and it's the job of > rangeFromLocationAndLength() to make a usable range out of a sequence of single-node iterator > positions. You're right, TextIterator::range() returns ranges that have the same node as the start and the end. But should this be so? Does it make more sense to change TextIterator to return ranges that make sense, instead of leaving that up to the clients of TextIterator?

Alexey Proskuryakov

Comment 15 2006-01-05 12:02:13 PST

(In reply to comment #14) Having the iterator only iterate over whole nodes makes it simpler conceptually, so I think this is a nice feature to keep... Also, some other clients rely on it (KWQAccObject has an assertion). Perhaps I will change my opinion as I get more experience with TextIterator, but at the moment I like the existing logic.

Justin Garcia

Comment 16 2006-01-05 15:16:34 PST

Yes, KWQAccObject does assert that the start/end containers of the range returned by the TextIterator are the same. But, the question is, in what way does it rely on this fact? The length == 0 case doesn't matter, because we're just changing ranges returned for emitted characters, which have the length == 1. In the length == 1 case, the startContainer is passed to AXAttributedStringAppendText. This method uses the startContainer to gather style information which it then uses to create the attributed string. So, if we wanted to change the ranges returned by TextIterator::range() in the way I described, we would just remove the assert and decide which node's style information to use for emitted characters when building an attributed string. Regardless of whether we use your approach or mine in this patch, it should include changes to other clients of TextIterator like findPlainText, or at least a defense of why such changes are unnecessary.

Justin Garcia

Comment 17 2006-01-13 14:52:00 PST

Comment on attachment 5459 [details] proposed patch r- for now, until the comments are addressed.

Blaze Burg

Comment 18 2016-08-03 10:53:35 PDT

I think we can close this bug, most of this code does not exist any more.

Note You need to log in before you can comment on or make changes to this bug.