Created attachment 106412 [details] Repro Chromium: https://code.google.com/p/chromium/issues/detail?id=95486 Repro: <style> * { -webkit-locale: 'xx-_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'; -webkit-text-security: disc; } </style> xx There seem to be a number of problems here in WebKit and icu that combine to cause this: - A missing NULL check in acquireLineBreakIterator, - A hardcoded size limit for a buffer in ures_open, - Code that fills a buffer completely, leaving no room for a terminating NULL in _canonicalize. Source code for WebCore::acquireLineBreakIterator [webkit\source\webcore\platform\text\textbreakiteratoricu.cpp] -------------- TextBreakIterator* acquireLineBreakIterator(const UChar* string, int length, const AtomicString& locale) { UBreakIterator* iterator = LineBreakIteratorPool::sharedPool().take(locale); /// RETURN NULL UErrorCode setTextStatus = U_ZERO_ERROR; ubrk_setText(iterator, string, length, &setTextStatus); /// NULL ptr crash if (U_FAILURE(setTextStatus)) { LOG_ERROR("ubrk_setText failed with status %d", setTextStatus); return 0; } return reinterpret_cast<TextBreakIterator*>(iterator); } -------------- The code path that sets iterator to NULL is: WebCore::LineBreakIteratorPool::take [webkit\source\webcore\platform\text\linebreakiteratorpoolicu.h] ubrk_open [icu\source\common\ubrk.cpp] BreakIterator::createLineInstance [icu\source\common\brkiter.cpp] BreakIterator::createInstance [icu\source\common\brkiter.cpp] BreakIterator::makeInstance [icu\source\common\brkiter.cpp] BreakIterator::buildInstance [icu\source\common\brkiter.cpp] ures_open [icu\source\common\uresbund.c] Parital source code for ures_open: --------------- U_CAPI UResourceBundle* U_EXPORT2 ures_open(const char* path, const char* localeID, UErrorCode* status) { char canonLocaleID[100]; UResourceDataEntry *hasData = NULL; UResourceBundle *r; if(status == NULL || U_FAILURE(*status)) { return NULL; } /* first "canonicalize" the locale ID */ uloc_getBaseName(localeID, canonLocaleID, sizeof(canonLocaleID), status); /// SEE BELOW /// if(U_FAILURE(*status) || *status == U_STRING_NOT_TERMINATED_WARNING) { *status = U_ILLEGAL_ARGUMENT_ERROR; return NULL; } -------------- The max locale size for uloc_getBaseName is limited to 100 by the hardcoded "canonLocaleID" buffer. It might be wise to switch to a dynamically allocated buffer here, if possible? Continuing down the code path: uloc_getBaseName [icu\source\common\uloc.c] _canonicalize [icu\source\common\uloc.c] Parital source code for _canonicalize: --------------- return u_terminateChars(result, resultCapacity, len, err); --------------- "resultCapacity" is the hardcoded 100 limit mentioned above. The locale we provided is 100 chars, so result if filled with 100 chars in the code before this return statement. It might be wise for the code to not fill the entire buffer, but leave some space for the terminate chars? Otherwise, all our efforts have been in vain. Continuing down the code path: u_terminateChars [icu\source\common\ustring.c] Source code for u_terminateChars: ---------------- U_CAPI int32_t U_EXPORT2 u_terminateChars(char *dest, int32_t destCapacity, int32_t length, UErrorCode *pErrorCode) { __TERMINATE_STRING(dest, destCapacity, length, pErrorCode); return length; } ---------------- #define __TERMINATE_STRING(dest, destCapacity, length, pErrorCode) \ if(pErrorCode!=NULL && U_SUCCESS(*pErrorCode)) { \ /* not a public function, so no complete argument checking */ \ \ if(length<0) { \ /* assume that the caller handles this */ \ } else if(length<destCapacity) { \ /* NUL-terminate the string, the NUL fits */ \ dest[length]=0; \ /* unset the not-terminated warning but leave all others */ \ if(*pErrorCode==U_STRING_NOT_TERMINATED_WARNING) { \ *pErrorCode=U_ZERO_ERROR; \ } \ } else if(length==destCapacity) { \ /* unable to NUL-terminate, but the string itself fit - set a warning code */ \ *pErrorCode=U_STRING_NOT_TERMINATED_WARNING; \ } else /* length>destCapacity */ { \ /* even the string itself did not fit - set an error code */ \ *pErrorCode=U_BUFFER_OVERFLOW_ERROR; \ } \ } ---------------- Since length is 100 and destCapacity is also 100, pErrorCode is set to U_STRING_NOT_TERMINATED_WARNING. Going back up the code path, from ures_open onwards, functions will return NULL leading to a NULL pointer in ubrk_setText. ubrk_setText [icu\source\common\ubrk.cpp] ---------------- U_CAPI void U_EXPORT2 ubrk_setText(UBreakIterator* bi, const UChar* text, int32_t textLength, UErrorCode* status) { BreakIterator *brit = (BreakIterator *)bi; //// bi and brit are NULL UText ut = UTEXT_INITIALIZER; utext_openUChars(&ut, text, textLength, status); brit->setText(&ut, *status); //// brit is NULL -> crash // A stack allocated UText wrapping a UCHar * string // can be dumped without explicitly closing it. } It might make sense to add a NULL check here.
For builds that cannot patch ICU, we'd better add a null check to TextBreakIteratorICU.cpp. What layout test should we add? The attachment here would suffice?
I have a simple patch.
The ICU-side of change I made is not acceptable because it's not in line with ICU C API convention about 'NULL' check. See http://bugs.icu-project.org/trac/ticket/9115 I'll upload a webkit change to better handle this situation.
Created attachment 128604 [details] patch for review
Created attachment 128609 [details] patch with a missing file added
This bug affects all webkit ports that rely on ICU including Safari on Mac. mitz, if you can review it, it'll be greatly appreciated.
Attachment 128609 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'LayoutTests/ChangeLog', u'LayoutTests/fast..." exit_code: 1 LayoutTests/ChangeLog:4: Line contains tab character. [whitespace/tab] [5] LayoutTests/ChangeLog:5: Line contains tab character. [whitespace/tab] [5] LayoutTests/ChangeLog:6: Line contains tab character. [whitespace/tab] [5] Source/WebCore/ChangeLog:4: Line contains tab character. [whitespace/tab] [5] Source/WebCore/ChangeLog:5: Line contains tab character. [whitespace/tab] [5] Source/WebCore/ChangeLog:6: Line contains tab character. [whitespace/tab] [5] Total errors found: 6 in 8 files If any of these errors are false positives, please file a bug against check-webkit-style.
Created attachment 128825 [details] patch (fix style nits)
Darin, can you take a look in case mitz cannot? Thanks
Comment on attachment 128825 [details] patch (fix style nits) View in context: https://bugs.webkit.org/attachment.cgi?id=128825&action=review Looks good, r- because I think you’re not handling an error case that was previously handled. > Source/WebCore/platform/text/LineBreakIteratorPoolICU.h:64 > + bool isLocaleEmpty = locale.isEmpty(); I think the question phrase “isLocaleEmpty” isn’t a great name for this variable. You could call it “localeIsEmpty”. > Source/WebCore/platform/text/LineBreakIteratorPoolICU.h:67 > + // locale comes from a web page and it can be invalid leading ICU > + // to fail, in which case we fall back to the default locale. You’re missing a comma after “invalid”. > Source/WebCore/platform/text/LineBreakIteratorPoolICU.h:75 > + if (U_FAILURE(openStatus)) { > + LOG_ERROR("ubrk_open failed with status %d", openStatus); > + return 0; > + } This should go outside the “if (!isLocalEmpty && U_FAILURE(openStatus))” block, otherwise we’re not handling the case of ubrk_open failing with the currentTextBreakLocaleID() when locale *is* empty.
Created attachment 129290 [details] patch updated per mitz's comments mitz, thanks for the review. Can you take another look?
Comment on attachment 129290 [details] patch updated per mitz's comments r=me!
Landed on Feb 28 as http://trac.webkit.org/changeset/109144