Bug 202356

Summary: Add some assertions to convertUTF8ToUTF16().
Product: WebKit Reporter: Mark Lam <mark.lam>
Component: JavaScriptCoreAssignee: Mark Lam <mark.lam>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, benjamin, cdumez, cmarcelo, commit-queue, dbates, ews-watchlist, fpizlo, saam, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
proposed patch.
saam: review+
proposed patch. none

Description Mark Lam 2019-09-29 16:10:19 PDT
<rdar://problem/52846813>
Comment 1 Mark Lam 2019-09-29 16:21:30 PDT
Created attachment 379820 [details]
proposed patch.
Comment 2 Saam Barati 2019-09-29 16:30:58 PDT
Comment on attachment 379820 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=379820&action=review

> Source/WTF/wtf/unicode/UTF8Conversion.cpp:95
> +    RELEASE_ASSERT(targetEnd - target <= std::numeric_limits<int>::max());

why does the code require this?
Comment 3 Mark Lam 2019-09-29 17:18:40 PDT
Comment on attachment 379820 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=379820&action=review

Thanks for the review.

>> Source/WTF/wtf/unicode/UTF8Conversion.cpp:95
>> +    RELEASE_ASSERT(targetEnd - target <= std::numeric_limits<int>::max());
> 
> why does the code require this?

For the same reason as why there's a "RELEASE_ASSERT(sourceEnd - source <= std::numeric_limits<int>::max())" above: to validate that the client didn't pass in a targetEnd that is at a lower address than the targetStart.
Comment 4 Mark Lam 2019-09-29 17:20:27 PDT
Surprisingly, looks like these assertions are failing on some tests.  I'm investigating.
Comment 5 Mark Lam 2019-09-29 18:38:03 PDT
(In reply to Mark Lam from comment #4)
> Surprisingly, looks like these assertions are failing on some tests.  I'm
> investigating.

Turns out, it's too aggressive to assert that (targetEnd - target <= std::numeric_limits<int>::max()). Some clients actually did the work to calculate the exact number of UChars needed, not just the upper bound.  We should not penalize these clients.  I'll upload an updated patch with a better assertion.
Comment 6 Mark Lam 2019-09-29 18:42:52 PDT
Created attachment 379821 [details]
proposed patch.

Let's try this on the EWS first.
Comment 7 Mark Lam 2019-09-30 10:31:38 PDT
Comment on attachment 379821 [details]
proposed patch.

Thanks for the review.  Landing now.
Comment 8 WebKit Commit Bot 2019-09-30 11:15:34 PDT
Comment on attachment 379821 [details]
proposed patch.

Clearing flags on attachment: 379821

Committed r250520: <https://trac.webkit.org/changeset/250520>
Comment 9 WebKit Commit Bot 2019-09-30 11:15:36 PDT
All reviewed patches have been landed.  Closing bug.
Comment 10 Saam Barati 2019-09-30 11:27:58 PDT
(In reply to Mark Lam from comment #3)
> Comment on attachment 379820 [details]
> proposed patch.
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=379820&action=review
> 
> Thanks for the review.
> 
> >> Source/WTF/wtf/unicode/UTF8Conversion.cpp:95
> >> +    RELEASE_ASSERT(targetEnd - target <= std::numeric_limits<int>::max());
> > 
> > why does the code require this?
> 
> For the same reason as why there's a "RELEASE_ASSERT(sourceEnd - source <=
> std::numeric_limits<int>::max())" above: to validate that the client didn't
> pass in a targetEnd that is at a lower address than the targetStart.

This isn't quite right. You were halving the output rang.
Comment 11 Mark Lam 2019-09-30 12:03:19 PDT
(In reply to Saam Barati from comment #10)
> (In reply to Mark Lam from comment #3)
> > >> Source/WTF/wtf/unicode/UTF8Conversion.cpp:95
> > >> +    RELEASE_ASSERT(targetEnd - target <= std::numeric_limits<int>::max());
> > > 
> > > why does the code require this?
> > 
> > For the same reason as why there's a "RELEASE_ASSERT(sourceEnd - source <=
> > std::numeric_limits<int>::max())" above: to validate that the client didn't
> > pass in a targetEnd that is at a lower address than the targetStart.
> 
> This isn't quite right. You were halving the output rang.

I clarified this with Saam offline: I am not halving the output range because target and targetEnd are in units of UChar*.  Hence, this assertion is correct.