Bug 127424

Summary: Crashes in setTextForIterator
Product: WebKit Reporter: peavo
Component: WebCore Misc.Assignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, bfulgham, commit-queue, sam
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch none

Description peavo 2014-01-22 06:19:56 PST
I'm getting a crash in the function setTextForIterator in WebCore/platform/text/TextBreakIterator.cpp.

We need to reserve sufficient memory for the mem set in the function openLatin1UTextProvider in WebCore/platform/text/icu/UTextProviderLatin1.cpp.
Comment 1 peavo 2014-01-22 06:24:37 PST
Created attachment 221864 [details]
Patch
Comment 2 Alexey Proskuryakov 2014-01-22 09:47:32 PST
I'm wondering if this is the same as bug 127408.
Comment 3 Brent Fulgham 2014-01-22 10:07:32 PST
Comment on attachment 221864 [details]
Patch

I'd like better understanding of why we are off by one here. Are some routines expecting the buffer to hold a null termination?  I notice that most of the uses of UTextWIthBufferInlineCapacity are actually "UTextWithBufferInlineCapacity + 1" (for memsets, etc.)  Is that where the crash was occurring?
Comment 4 peavo 2014-01-22 10:13:19 PST
(In reply to comment #3)
> (From update of attachment 221864 [details])
> I'd like better understanding of why we are off by one here. Are some routines expecting the buffer to hold a null termination?  I notice that most of the uses of UTextWIthBufferInlineCapacity are actually "UTextWithBufferInlineCapacity + 1" (for memsets, etc.)  Is that where the crash was occurring?

No, the crash occurs at the end of setTextForIterator, because runtime checks detects that we have written past a stack variable (UTextWithBuffer textLocal, in setTextForIterator), and damaged the stack.
It is the memset that writes past the stack variable.
Comment 5 Brent Fulgham 2014-01-22 10:19:33 PST
(In reply to comment #4)
> (In reply to comment #3)
> > (From update of attachment 221864 [details] [details])
> > I'd like better understanding of why we are off by one here. Are some routines expecting the buffer to hold a null termination?  I notice that most of the uses of UTextWIthBufferInlineCapacity are actually "UTextWithBufferInlineCapacity + 1" (for memsets, etc.)  Is that where the crash was occurring?
> 
> No, the crash occurs at the end of setTextForIterator, because runtime checks detects that we have written past a stack variable (UTextWithBuffer textLocal, in setTextForIterator), and damaged the stack.
> It is the memset that writes past the stack variable.

I see. The call to "openLatin1UTextProvider" has the memset, and the memset always writes UTextWithBufferInlineCapacity + 1.

It's unclear if we should be resizing the buffer, or revising the uses of UTextWithBufferInlineCapacity to avoid the additional "+ 1".

For example, if we resize the buffer by one, is this math now wrong? (see UTextProviderLatin1.cpp line 125):

uText->chunkNativeLimit = uText->chunkNativeStart + UTextWithBufferInlineCapacity;
Comment 6 peavo 2014-01-22 10:24:44 PST
(In reply to comment #5)
> (In reply to comment #4)
> 
> It's unclear if we should be resizing the buffer, or revising the uses of UTextWithBufferInlineCapacity to avoid the additional "+ 1".
> 

Good point, that might be the case.

> For example, if we resize the buffer by one, is this math now wrong? (see UTextProviderLatin1.cpp line 125):
> 
> uText->chunkNativeLimit = uText->chunkNativeStart + UTextWithBufferInlineCapacity;

I don't think this patch changes any logic, as we only allocate more space, without changing the UTextWithBufferInlineCapacity constant.
Comment 7 Brent Fulgham 2014-01-22 10:29:19 PST
(In reply to comment #6)
> (In reply to comment #5)
> > (In reply to comment #4)
> > 
> > It's unclear if we should be resizing the buffer, or revising the uses of UTextWithBufferInlineCapacity to avoid the additional "+ 1".
> > 
> 
> Good point, that might be the case.
> 
> > For example, if we resize the buffer by one, is this math now wrong? (see UTextProviderLatin1.cpp line 125):
> > 
> > uText->chunkNativeLimit = uText->chunkNativeStart + UTextWithBufferInlineCapacity;
> 
> I don't think this patch changes any logic, as we only allocate more space, without changing the UTextWithBufferInlineCapacity constant.

I probably used a bad example. Consider Line 135 in the same file:

uText->chunkNativeStart = uText->chunkNativeLimit -  UTextWithBufferInlineCapacity;

Now we position are starting point in the final element of the buffer, rather than one past it. Doesn't that mean we will get improper results?
Comment 8 Brent Fulgham 2014-01-22 10:36:13 PST
Comment on attachment 221864 [details]
Patch

Looking through the code, there are numerous places where sizeof(buffer) is used, and others where UTextWithBufferInlineCapacity is used.

I think the right fix is to change all the cases of "UTextWithBufferInlineCapacity + 1" to just "UTextWithBufferInlineCapacity".  Otherwise, I am concerned that our iterator math will be wrong in some cases resulting in undefined behavior.

The only concern I have with my suggestion is that there are uses where the UTextWithBuffer client assumes that the "UTextWithBufferInlineCapacity" is the number of valid characters, with an implicit extra null "byte" at the end.
Comment 9 Brent Fulgham 2014-01-22 10:39:17 PST
Thank you for tracking down the cause of this problem. After talking your patch over with a couple of the other people on the team, I *think* the correct solution is to remove the "+1" from the various places where UTextWithBufferInlineCapacity is used.

Note also that this method:
static UText* uTextLatin1Clone(UText* destination, const UText* source, UBool deep, UErrorCode* status)

calls utext_setup with a +1, unlike all other uses of utext_setup in the code base.

Looks like you found a nasty little off-by-one we've been living with for some time!
Comment 10 peavo 2014-01-22 10:47:40 PST
(In reply to comment #9)
> Thank you for tracking down the cause of this problem. After talking your patch over with a couple of the other people on the team, I *think* the correct solution is to remove the "+1" from the various places where UTextWithBufferInlineCapacity is used.
> 
> Note also that this method:
> static UText* uTextLatin1Clone(UText* destination, const UText* source, UBool deep, UErrorCode* status)
> 
> calls utext_setup with a +1, unlike all other uses of utext_setup in the code base.
> 
> Looks like you found a nasty little off-by-one we've been living with for some time!

Thanks for looking into this :) Will update the patch.
Comment 11 peavo 2014-01-22 11:00:17 PST
Created attachment 221875 [details]
Patch
Comment 12 peavo 2014-01-22 11:06:25 PST
(In reply to comment #8)
> I think the right fix is to change all the cases of "UTextWithBufferInlineCapacity + 1" to just "UTextWithBufferInlineCapacity".  Otherwise, I am concerned that our iterator math will be wrong in some cases resulting in undefined behavior.

Updated the patch as you suggested, works nicely :)
Comment 13 Brent Fulgham 2014-01-22 11:22:57 PST
Comment on attachment 221875 [details]
Patch

Looks great! r=me
Comment 14 WebKit Commit Bot 2014-01-22 11:50:46 PST
Comment on attachment 221875 [details]
Patch

Clearing flags on attachment: 221875

Committed r162544: <http://trac.webkit.org/changeset/162544>
Comment 15 WebKit Commit Bot 2014-01-22 11:50:49 PST
All reviewed patches have been landed.  Closing bug.
Comment 16 Alexey Proskuryakov 2014-01-22 13:02:39 PST
*** Bug 127408 has been marked as a duplicate of this bug. ***