Bug 213702

Summary: Add handling for a case of OOME in CSSTokenizer and CSSParser.
Product: WebKit Reporter: Mark Lam <mark.lam>
Component: CSSAssignee: Mark Lam <mark.lam>
Status: RESOLVED FIXED    
Severity: Normal CC: annulen, benjamin, cdumez, cmarcelo, darin, esprehn+autocc, ews-watchlist, ggaren, glenn, gyuyoung.kim, macpherson, menard, ryuan.choi, sergio, webkit-bug-importer, ysuzuki
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
proposed patch.
none
proposed patch.
none
proposed patch.
darin: review+
patch for landing.
none
patch for landing.
none
patch for landing. none

Mark Lam
Reported 2020-06-28 13:12:27 PDT
Attachments
proposed patch. (37.14 KB, patch)
2020-06-28 13:22 PDT, Mark Lam
no flags
proposed patch. (37.29 KB, patch)
2020-06-28 14:11 PDT, Mark Lam
no flags
proposed patch. (40.46 KB, patch)
2020-06-29 16:42 PDT, Mark Lam
darin: review+
patch for landing. (40.35 KB, patch)
2020-06-29 22:22 PDT, Mark Lam
no flags
patch for landing. (40.28 KB, patch)
2020-06-30 00:07 PDT, Mark Lam
no flags
patch for landing. (40.47 KB, patch)
2020-06-30 13:25 PDT, Mark Lam
no flags
Mark Lam
Comment 1 2020-06-28 13:22:02 PDT
Created attachment 403007 [details] proposed patch.
Mark Lam
Comment 2 2020-06-28 14:11:34 PDT
Created attachment 403010 [details] proposed patch.
Darin Adler
Comment 3 2020-06-28 14:29:28 PDT
Comment on attachment 403010 [details] proposed patch. View in context: https://bugs.webkit.org/attachment.cgi?id=403010&action=review > Source/WTF/wtf/AllocationTraits.h:30 > +enum class AllocationTraits { This name seems too broad. This is only about what to do when allocation fails due to exhaustion, not other traits (categories of examples: whether to prioritize memory use cost over performance, whether memory should be initialized, what kind of alignment is needed, whether the memory is to be freed a block at a time or by zone, for that matter what zone should be used, etc.). I would call this something more like AllocationFailureAction::Crash and AllocationFailureAction::Report. Or maybe Policy instead of Action? Or maybe MemoryExhaustionAction? It’s worth thinking through what’s the clearest term for this. Might even end up being a shorter phrase, and easier to read at call sites. > Source/WebCore/css/parser/CSSParserImpl.h:70 > + bool failed() const; I’m not thrilled with this. When are we suppose to check "failed"? If only at construction time, then I suggest we change things so there is a create function and we return nullptr if the creation failed. Otherwise we have this "failed CSSParserImpl" where it’s ambiguous what functions are OK to call on it. And what guarantees people remember to check failed()? I think it’s much better to build this into the process of creating a CSSParserImpl in a way that’s harder to ignore. > Source/WebCore/css/parser/CSSTokenizer.h:49 > CSSTokenizer(const String&, CSSParserObserverWrapper&); // For the inspector I don’t see failure handling in the user of this constructor? Do we not need it? > Source/WebCore/css/parser/CSSTokenizer.h:51 > + bool failed() const { return m_failed; } Same comment as CSSParserImpl. We should build this into the process of construction rather than having this thing called a "failed CSSTokenizer" where it’s illegal to use it for anything. Instead we should have a create function that returns nullptr. Note also that this failed state is not valuable during the lifetime of the tokenizer, only during its construction.
Geoffrey Garen
Comment 4 2020-06-28 17:53:02 PDT
What is the CSS parsing size limit in other browsers? Seems like we should try to match them. (You can find out by expanding the test case to include a supported syntax, and then repeating that syntax exponentially more times until failure.)
Mark Lam
Comment 5 2020-06-29 13:30:14 PDT
(In reply to Geoffrey Garen from comment #4) > What is the CSS parsing size limit in other browsers? Seems like we should > try to match them. > > (You can find out by expanding the test case to include a supported syntax, > and then repeating that syntax exponentially more times until failure.) For Firefox Nightly, the string length reached ~0x3FFFFFF9, and CSS.supports() still returns true. Thereafter, we get an `InternalError: allocation size overflow` from failing to make a longer string. For Chrome Canary, the string length reached ~0x77FFFFD (125829117), and CSS.supports() still returns true. Trying to double the string length thereafter causes the Chrome page to crash "Aw, Snap! Something went wrong while displaying this webpage." Base on these results, I don't think we should worry about match Chrome or FF's behavior. We should just make our own work.
Mark Lam
Comment 6 2020-06-29 13:52:49 PDT
Also, I think throwing an OOME would have been the most elegant solution i.e. we would never have to worry about having a CSS tokenizer in an inconsistent state. But if we don't want to OOMEs here, well, then I'll need two figure out what all the failure modes would be for all the various clients of the CSSTokenizer, which I think is the crux of the issue that Darin brought up.
Mark Lam
Comment 7 2020-06-29 16:42:03 PDT
Created attachment 403136 [details] proposed patch.
Mark Lam
Comment 8 2020-06-29 16:51:25 PDT
I didn't want to have to figure out the failure modes for all the other clients right now. So, I opted to let them continue crashing on error detection as they do right now. This patch only fixes the case for CSS.support() as originally intended.
Darin Adler
Comment 9 2020-06-29 17:25:00 PDT
Comment on attachment 403136 [details] proposed patch. View in context: https://bugs.webkit.org/attachment.cgi?id=403136&action=review I like this, better approach. > Source/WebCore/css/parser/CSSParser.cpp:93 > CSSParserImpl parser(m_context, condition); > + if (!parser.tokenizer()) > + return false; Is this the only place we construct a CSSParserImpl? > Source/WebCore/css/parser/CSSTokenizer.cpp:54 > + auto tokenizer = makeUnique<CSSTokenizer>(string, &success); I think this should use (preprocessString(string), &wrapper, nullptr) so it uses the inner constructor. > Source/WebCore/css/parser/CSSTokenizer.cpp:63 > + auto tokenizer = makeUnique<CSSTokenizer>(string, wrapper, &success); I think this should use (preprocessString(string), &wrapper, &success) so it uses the inner constructor. > Source/WebCore/css/parser/CSSTokenizer.cpp:79 > +inline CSSTokenizer::CSSTokenizer(String&& string, CSSParserObserverWrapper* wrapper, bool* constructionSuccessPtr) Should get rid of this "inline"; does no good. Doesn’t even affect inlining. > Source/WebCore/css/parser/CSSTokenizer.cpp:84 > + bool localConstructionSuccess; > + bool& constructionSuccess = constructionSuccessPtr ? *constructionSuccessPtr : localConstructionSuccess; > + constructionSuccess = true; This is unnecessarily complicated; the local variable doesn’t make the code any clearer. Just write: if (constructionSuccessPtr) *constructionSuccessPtr = true; And use constructionSuccessPtr below, in the two places it's used, right after the assertions. > Source/WebCore/css/parser/CSSTokenizer.cpp:93 > + constexpr bool reserveInitialCapacitySucceeded = false; > + RELEASE_ASSERT(constructionSuccessPtr || reserveInitialCapacitySucceeded); Here’s how I would write this: // When constructionSuccessPtr is null, our policy is to crash on failure. RELEASE_ASSERT(constructionSuccessPtr); The strange constexpr boolean does not work for me as a comment. > Source/WebCore/css/parser/CSSTokenizer.cpp:109 > + constexpr bool appendSucceeded = false; > + RELEASE_ASSERT(constructionSuccessPtr || appendSucceeded); Ditto. > Source/WebCore/css/parser/CSSTokenizer.h:49 > + static std::unique_ptr<CSSTokenizer> create(const String&); > + static std::unique_ptr<CSSTokenizer> create(const String&, CSSParserObserverWrapper&); // For the inspector I guess since the default is you can just construct these, then these should be called tryCreate. > Source/WebCore/css/parser/CSSTokenizer.h:52 > + explicit CSSTokenizer(const String&, bool* constructionSuccess = nullptr); > + CSSTokenizer(const String&, CSSParserObserverWrapper&, bool* constructionSuccess = nullptr); // For the inspector I suggest not adding bool* constructionSuccess to these. We don’t want that to be public. The tryCreate functions take care of that. And it’s simple to modify the implementation of the tryCreate functions so they use the private constructor.
Darin Adler
Comment 10 2020-06-29 17:26:16 PDT
Comment on attachment 403136 [details] proposed patch. View in context: https://bugs.webkit.org/attachment.cgi?id=403136&action=review I’m not entirely sure this is the correct strategy; typically we try to avoid spreading out of memory checks across so much code. But I don’t see an alternative at the moment. > Source/WTF/ChangeLog:51 > + (WTF::Malloc>::expandCapacity): > + (WTF::Malloc>::resize): > + (WTF::Malloc>::grow): > + (WTF::Malloc>::reserveCapacity): > + (WTF::Malloc>::reserveInitialCapacity): > + (WTF::Malloc>::append): > + (WTF::Malloc>::constructAndAppend): > + (WTF::Malloc>::appendSlowCase): > + (WTF::Malloc>::constructAndAppendSlowCase): > + (WTF::Malloc>::appendVector): > + (WTF::Malloc>::insert): > + (WTF::Malloc>::tryExpandCapacity): Deleted. > + (WTF::Malloc>::tryReserveCapacity): Deleted. > + (WTF::Malloc>::tryAppend): Deleted. > + (WTF::Malloc>::tryConstructAndAppend): Deleted. > + (WTF::Malloc>::tryConstructAndAppendSlowCase): Deleted. These function names all seem to be garbled.
Darin Adler
Comment 11 2020-06-29 17:31:30 PDT
Here’s what I mean when I say I am not sure this is the correct strategy. If we are processing a large CSS then we might get really close to memory exhaustion with all the various data structures including the vector. If we do get close, one of the smaller allocations might be the one that fails, rather than the allocations done by Vector. So all this work to make Vector itself smart might be insufficient; we might end up crashing instead of failing gracefully. I don’t know what guarantees that Vector is the only large data structure. And if it currently is, what guarantees it will remain that way. All we’d have to do is add a hash table or any other data structure and this work might be insufficient.
Mark Lam
Comment 12 2020-06-29 20:29:37 PDT
Thanks for the review. I'll address the rest of your feedback shortly, but thought I'd address the following one first. (In reply to Darin Adler from comment #11) > Here’s what I mean when I say I am not sure this is the correct strategy. If > we are processing a large CSS then we might get really close to memory > exhaustion with all the various data structures including the vector. If we > do get close, one of the smaller allocations might be the one that fails, > rather than the allocations done by Vector. So all this work to make Vector > itself smart might be insufficient; we might end up crashing instead of > failing gracefully. I don’t know what guarantees that Vector is the only > large data structure. And if it currently is, what guarantees it will remain > that way. All we’d have to do is add a hash table or any other data > structure and this work might be insufficient. The prevailing practice is that we do our best to handle OOMEs where (1) it's possible to do so without incurring too much code complexity. For example, a patch that requires changing a wide-spread graph of code to return error codes would be deemed too much complexity. Or (2) where the benefit of doing so would be very high. In other words, we take an only best effort approach, and we do these gradually as the opportunity to fix them arises. For the rest, we just let them crash on allocation failures. This specific patch falls into case (1). The changes to Vector may be slightly complex, but I think it's well contained and easy to understand. Therefore, the complexity is acceptable. These changes may also make it more convenient to fix other OOME issues later without less effort too. It also has the benefit of making the code more robust against the class of bugs that can arise from code duplication leading to opportunities for divergence between duplicated code paths. So, that makes the change worthwhile IMHO. It's understood that there will be other OOME cases which can still lead to crashes.
Mark Lam
Comment 13 2020-06-29 21:17:47 PDT
Comment on attachment 403136 [details] proposed patch. View in context: https://bugs.webkit.org/attachment.cgi?id=403136&action=review Thanks again for the review. >> Source/WTF/ChangeLog:51 >> + (WTF::Malloc>::tryConstructAndAppendSlowCase): Deleted. > > These function names all seem to be garbled. This mangling came from template arguments e.g. Vector<T, inlineCapacity, OverflowHandler, minCapacity, Malloc>::expandCapacity. I'll fix up the ChangeLog manually. >> Source/WebCore/css/parser/CSSParser.cpp:93 >> + return false; > > Is this the only place we construct a CSSParserImpl? It is not. However, the existing places that construct CSSParserImpl will in some cases use the underlying tokenizer, and some will not. The cases that do not use the tokenizer will continue to work just as they do today. The case that do use the tokenizer always start by dereferencing the tokenizer before they do anything else with the CSSParserImpl. Hence, that case will crash as it does today if the tokenizer cannot be fulling initialized. So, there's no change from the status quo. >> Source/WebCore/css/parser/CSSTokenizer.cpp:54 >> + auto tokenizer = makeUnique<CSSTokenizer>(string, &success); > > I think this should use (preprocessString(string), &wrapper, nullptr) so it uses the inner constructor. Fixed. >> Source/WebCore/css/parser/CSSTokenizer.cpp:63 >> + auto tokenizer = makeUnique<CSSTokenizer>(string, wrapper, &success); > > I think this should use (preprocessString(string), &wrapper, &success) so it uses the inner constructor. Fixed. >> Source/WebCore/css/parser/CSSTokenizer.cpp:79 >> +inline CSSTokenizer::CSSTokenizer(String&& string, CSSParserObserverWrapper* wrapper, bool* constructionSuccessPtr) > > Should get rid of this "inline"; does no good. Doesn’t even affect inlining. Fixed. >> Source/WebCore/css/parser/CSSTokenizer.cpp:84 >> + constructionSuccess = true; > > This is unnecessarily complicated; the local variable doesn’t make the code any clearer. Just write: > > if (constructionSuccessPtr) > *constructionSuccessPtr = true; > > And use constructionSuccessPtr below, in the two places it's used, right after the assertions. Fixed. >> Source/WebCore/css/parser/CSSTokenizer.cpp:93 >> + RELEASE_ASSERT(constructionSuccessPtr || reserveInitialCapacitySucceeded); > > Here’s how I would write this: > > // When constructionSuccessPtr is null, our policy is to crash on failure. > RELEASE_ASSERT(constructionSuccessPtr); > > The strange constexpr boolean does not work for me as a comment. In addition to being used as a comment, I had intended to use the bool as a way to make the assertion failure message a little more descriptive: ASSERTION FAILED: constructionSuccessPtr || reserveInitialCapacitySucceeded Just "ASSERTION FAILED: constructionSuccessPtr" is not very informative. That said, I just realize that on Release builds, we don't produce this failure message: we just crash. Hence, there's no added value for Release builds. For Debug builds, we will print the file and line number anyway. Without any Release build benefits, I don't feel a strong need for this anymore. Anyway, fixed. >> Source/WebCore/css/parser/CSSTokenizer.cpp:109 >> + RELEASE_ASSERT(constructionSuccessPtr || appendSucceeded); > > Ditto. Fixed. >> Source/WebCore/css/parser/CSSTokenizer.h:49 >> + static std::unique_ptr<CSSTokenizer> create(const String&, CSSParserObserverWrapper&); // For the inspector > > I guess since the default is you can just construct these, then these should be called tryCreate. Fixed. >> Source/WebCore/css/parser/CSSTokenizer.h:52 >> + CSSTokenizer(const String&, CSSParserObserverWrapper&, bool* constructionSuccess = nullptr); // For the inspector > > I suggest not adding bool* constructionSuccess to these. We don’t want that to be public. The tryCreate functions take care of that. And it’s simple to modify the implementation of the tryCreate functions so they use the private constructor. I had thought that this might leave the door open for other clients to choose to handle the construction failure error too later instead of crashing. But it's not too hard to re-add this when they choose to do so. Fixed.
Mark Lam
Comment 14 2020-06-29 22:22:45 PDT
Created attachment 403175 [details] patch for landing. Let's try this on the EWS while I also try to test for the reported EWS failure locally.
Mark Lam
Comment 15 2020-06-30 00:00:30 PDT
Comment on attachment 403175 [details] patch for landing. View in context: https://bugs.webkit.org/attachment.cgi?id=403175&action=review > Source/WebCore/css/parser/CSSParserImpl.cpp:82 > + if (m_tokenizer) > + return; It turns out that this m_tokenizer null check is incorrect. Apparently, there are normal circumstances where CSSParserImpl is constructed with a null m_tokenizer, but we expect to still create the m_deferredParser below. Note: there are some clients that don't use the m_tokenizer, and there are some clients that don't. For the clients that don't, m_tokenizer can be null but these clients may rely on the m_deferredParser below. This is the source of the EWS failures. Hence, I'll remove this null check with the early return.
Mark Lam
Comment 16 2020-06-30 00:07:58 PDT
Created attachment 403179 [details] patch for landing.
Darin Adler
Comment 17 2020-06-30 12:27:44 PDT
Comment on attachment 403175 [details] patch for landing. View in context: https://bugs.webkit.org/attachment.cgi?id=403175&action=review >> Source/WebCore/css/parser/CSSParserImpl.cpp:82 >> + return; > > It turns out that this m_tokenizer null check is incorrect. Apparently, there are normal circumstances where CSSParserImpl is constructed with a null m_tokenizer, but we expect to still create the m_deferredParser below. Note: there are some clients that don't use the m_tokenizer, and there are some clients that don't. For the clients that don't, m_tokenizer can be null but these clients may rely on the m_deferredParser below. This is the source of the EWS failures. Hence, I'll remove this null check with the early return. This was written backwards. It says "if (m_tokenizer) return", not "if (!m_tokenizer) return".
Mark Lam
Comment 18 2020-06-30 12:46:44 PDT
Comment on attachment 403175 [details] patch for landing. View in context: https://bugs.webkit.org/attachment.cgi?id=403175&action=review >>> Source/WebCore/css/parser/CSSParserImpl.cpp:82 >>> + return; >> >> It turns out that this m_tokenizer null check is incorrect. Apparently, there are normal circumstances where CSSParserImpl is constructed with a null m_tokenizer, but we expect to still create the m_deferredParser below. Note: there are some clients that don't use the m_tokenizer, and there are some clients that don't. For the clients that don't, m_tokenizer can be null but these clients may rely on the m_deferredParser below. This is the source of the EWS failures. Hence, I'll remove this null check with the early return. > > This was written backwards. It says "if (m_tokenizer) return", not "if (!m_tokenizer) return". Face palm! Thanks for pointing that out. I'll investigate further.
Mark Lam
Comment 19 2020-06-30 13:25:41 PDT
Created attachment 403236 [details] patch for landing.
Mark Lam
Comment 20 2020-06-30 14:12:36 PDT
(In reply to Darin Adler from comment #17) > Comment on attachment 403175 [details] > patch for landing. > > View in context: > https://bugs.webkit.org/attachment.cgi?id=403175&action=review > > >> Source/WebCore/css/parser/CSSParserImpl.cpp:82 > >> + return; > > > > It turns out that this m_tokenizer null check is incorrect. Apparently, there are normal circumstances where CSSParserImpl is constructed with a null m_tokenizer, but we expect to still create the m_deferredParser below. Note: there are some clients that don't use the m_tokenizer, and there are some clients that don't. For the clients that don't, m_tokenizer can be null but these clients may rely on the m_deferredParser below. This is the source of the EWS failures. Hence, I'll remove this null check with the early return. > > This was written backwards. It says "if (m_tokenizer) return", not "if > (!m_tokenizer) return". I re-instated the null check. Tests are passing. I'll land this patch shortly.
Mark Lam
Comment 21 2020-06-30 14:17:42 PDT
Note You need to log in before you can comment on or make changes to this bug.