Currently we send data URLs to the networking layer for decoding. This involves a long round trip through various processes and API layers.
rdar://problem/12858179
Created attachment 259282 [details] patch
Attachment 259282 [details] did not pass style-queue: ERROR: Source/WebCore/platform/network/DataURLDecoder.cpp:49: Extra space before ( in function call [whitespace/parens] [4] ERROR: Source/WebCore/platform/network/DataURLDecoder.cpp:131: Extra space before ( in function call [whitespace/parens] [4] ERROR: Source/WebCore/platform/network/DataURLDecoder.h:46: Extra space before ( in function call [whitespace/parens] [4] Total errors found: 3 in 9 files If any of these errors are false positives, please file a bug against check-webkit-style.
Comment on attachment 259282 [details] patch
View in context: https://bugs.webkit.org/attachment.cgi?id=259282&action=review

> Source/WebCore/platform/network/DataURLDecoder.cpp:61
> +    ASSERT(urlString.startsWith(dataString));

Shouldn't this be case insensitive?

> Source/WebCore/platform/network/DataURLDecoder.cpp:63
> +    size_t mediaTypeEnd = urlString.find(base64String);

Ditto.
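For illustration, here is a minimal sketch in standard C++ of a header parse that ignores ASCII case for both the "data:" scheme and the ";base64" marker. The names `DataURLHeader`, `equalIgnoringASCIICase` and `parseDataURLHeader` are hypothetical and are not the patch's code; the sketch also assumes the marker is the last header parameter and leaves out media type defaults and charset handling.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical result type for this sketch only.
struct DataURLHeader {
    std::string mediaType;      // header text before any ";base64" marker
    bool isBase64 { false };
    std::string_view payload;   // view into the caller's URL string
};

// Lowercases one ASCII letter and leaves every other character untouched.
constexpr char toASCIILower(char c)
{
    return (c >= 'A' && c <= 'Z') ? static_cast<char>(c + ('a' - 'A')) : c;
}

// Compares two strings, ignoring ASCII case only.
bool equalIgnoringASCIICase(std::string_view a, std::string_view b)
{
    if (a.size() != b.size())
        return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (toASCIILower(a[i]) != toASCIILower(b[i]))
            return false;
    }
    return true;
}

// Splits "data:<header>,<payload>"; returns nullopt when there is no comma.
std::optional<DataURLHeader> parseDataURLHeader(std::string_view url)
{
    constexpr std::string_view scheme = "data:";
    constexpr std::string_view base64Marker = ";base64";
    // Precondition: the caller only passes data URLs, in any ASCII case.
    assert(equalIgnoringASCIICase(url.substr(0, scheme.size()), scheme));

    std::size_t headerEnd = url.find(',', scheme.size());
    if (headerEnd == std::string_view::npos)
        return std::nullopt;

    std::string_view header = url.substr(scheme.size(), headerEnd - scheme.size());
    DataURLHeader result;
    result.payload = url.substr(headerEnd + 1);
    if (header.size() >= base64Marker.size()
        && equalIgnoringASCIICase(header.substr(header.size() - base64Marker.size()), base64Marker)) {
        result.isBase64 = true;
        header.remove_suffix(base64Marker.size());
    }
    result.mediaType = std::string(header);
    return result;
}
```

The payload is returned as a view, so the URL string has to outlive the result.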
Comment on attachment 259282 [details] patch Attachment 259282 [details] did not pass mac-ews (mac): Output: http://webkit-queues.webkit.org/results/72522 New failing tests: contentfiltering/block-after-response-then-deny-unblock.html svg/text/text-default-font-size.html contentfiltering/block-after-response.html contentfiltering/block-after-response-then-allow-unblock.html editing/selection/find-yensign-and-backslash.html imported/mozilla/svg/filters/feSpecularLighting-1.svg
Created attachment 259286 [details] Archive of layout-test-results from ews103 for mac-mavericks The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews103 Port: mac-mavericks Platform: Mac OS X 10.9.5
Comment on attachment 259282 [details] patch Attachment 259282 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.webkit.org/results/72521 New failing tests: editing/selection/find-yensign-and-backslash.html imported/mozilla/svg/filters/feSpecularLighting-1.svg svg/text/text-default-font-size.html
Created attachment 259287 [details] Archive of layout-test-results from ews105 for mac-mavericks-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews105 Port: mac-mavericks-wk2 Platform: Mac OS X 10.9.5
Comment on attachment 259282 [details] patch
View in context: https://bugs.webkit.org/attachment.cgi?id=259282&action=review

> Source/WebCore/loader/ResourceLoader.cpp:228
> +        return;

Will that make loading data uri resources synchronous? If yes then I think this is going to be problematic. Please see comments in https://bugs.webkit.org/show_bug.cgi?id=99677. Also if you stick with this approach, you may want to roll out http://trac.webkit.org/changeset/179626 since it will not be needed anymore.
Comment on attachment 259282 [details] patch
View in context: https://bugs.webkit.org/attachment.cgi?id=259282&action=review

>> Source/WebCore/loader/ResourceLoader.cpp:228
>> +        return;
>
> Will that make loading data uri resources synchronous? If yes then I think this is going to be problematic. Please see comments in https://bugs.webkit.org/show_bug.cgi?id=99677. Also if you stick with this approach, you may want to roll out http://trac.webkit.org/changeset/179626 since it will not be needed anymore.

DataURLDecoder::decode() does the decoding in another thread, so it will not be synchronous.
Comment on attachment 259282 [details] patch
View in context: https://bugs.webkit.org/attachment.cgi?id=259282&action=review

>>> Source/WebCore/loader/ResourceLoader.cpp:228
>>> +        return;
>>
>> Will that make loading data uri resources synchronous? If yes then I think this is going to be problematic. Please see comments in https://bugs.webkit.org/show_bug.cgi?id=99677. Also if you stick with this approach, you may want to roll out http://trac.webkit.org/changeset/179626 since it will not be needed anymore.
>
> DataURLDecoder::decode() does the decoding in another thread, so it will not be synchronous.

So who is firing the onload event for data uri resources in this approach?
Comment on attachment 259282 [details] patch
View in context: https://bugs.webkit.org/attachment.cgi?id=259282&action=review

> Source/WebCore/platform/network/DataURLDecoder.cpp:153
> +        RunLoop::main().dispatch([decodeTaskPtr, success] {

Does this work correctly when the WebThread is in use? As far as I can tell, RunLoop::initializeMainRunLoop() is never called in that case.
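The pattern under discussion is to decode on another thread and then hand the result back to the thread that owns the loader. A minimal sketch of that pattern in standard C++ follows. `CallbackQueue` and `decodeAsync` are hypothetical stand-ins, not WebKit's RunLoop or DataURLDecoder, and the "decoding" is placeholder work. The point of the sketch is that the completion handler runs only when the owning thread drains its own queue, which is why it matters which run loop the result is dispatched to.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical stand-in for a run loop: a thread-safe queue of callbacks
// that the owning thread drains explicitly.
class CallbackQueue {
public:
    // May be called from any thread.
    void dispatch(std::function<void()> callback)
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_callbacks.push(std::move(callback));
        }
        m_condition.notify_one();
    }

    // Blocks until a callback is available, then runs it on the calling thread.
    void runOne()
    {
        std::function<void()> callback;
        {
            std::unique_lock<std::mutex> lock(m_mutex);
            m_condition.wait(lock, [this] { return !m_callbacks.empty(); });
            callback = std::move(m_callbacks.front());
            m_callbacks.pop();
        }
        callback();
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_condition;
    std::queue<std::function<void()>> m_callbacks;
};

// Starts the work on a new thread and returns immediately. The completion
// handler is not called here; it is queued for the thread that owns originQueue.
std::thread decodeAsync(std::string input, CallbackQueue& originQueue, std::function<void(std::string)> completionHandler)
{
    assert(completionHandler);
    return std::thread([input = std::move(input), &originQueue, completionHandler = std::move(completionHandler)]() mutable {
        std::string decoded = "decoded:" + input; // placeholder for real decoding
        originQueue.dispatch([completionHandler = std::move(completionHandler), decoded = std::move(decoded)]() mutable {
            completionHandler(std::move(decoded));
        });
    });
}
```

The caller must keep the queue alive until the worker has been joined, since the worker holds a reference to it.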
Created attachment 259369 [details] patch
Attachment 259369 [details] did not pass style-queue: ERROR: Source/WebCore/platform/network/DataURLDecoder.cpp:49: Extra space before ( in function call [whitespace/parens] [4] ERROR: Source/WebCore/platform/network/DataURLDecoder.cpp:129: Extra space before ( in function call [whitespace/parens] [4] ERROR: Source/WebCore/platform/network/DataURLDecoder.h:46: Extra space before ( in function call [whitespace/parens] [4] Total errors found: 3 in 10 files If any of these errors are false positives, please file a bug against check-webkit-style.
> Does this work correctly when the WebThread is in use? As far as I can tell,
> RunLoop::initializeMainRunLoop() is never called in that case.

No, it won't. Maybe we should initialize RunLoop::main() to be the web thread runloop on iOS WK1? Or maybe add a WebCore level global function somewhere for getting the web RunLoop/WorkQueue?

> So who is firing the onload event for data uri resources in this approach?

They are fired by whoever fires them currently.
Comment on attachment 259369 [details] patch
View in context: https://bugs.webkit.org/attachment.cgi?id=259369&action=review

> Source/WebCore/platform/network/DataURLDecoder.h:31
> +#include <wtf/text/WTFString.h>

Could use <wtf/Forward.h> instead.

> Source/WebCore/platform/network/DataURLDecoder.h:46
> +void decode(const URL&, std::function<void (const Result*)>);

Why a pointer rather than a reference?

> Source/WebCore/platform/text/DecodeEscapeSequences.h:118
> -    return (encoding.isValid() ? encoding : UTF8Encoding()).decode(buffer.data(), p - buffer.data());
> +    if (encoding.isValid())
> +        return encoding.decode(buffer.data(), p - buffer.data());
> +    return String(buffer.data(), p - buffer.data());

What makes this behavior change OK? Maybe I am wrong, but I don’t think it’s OK! I have no objections to making a fast path which treats sequences as Latin-1 since that’s what we store in 8-bit WTF::String, but that’s not the same thing as treating sequences as UTF-8. The old contract was that we would treat the sequences as UTF-8 when no encoding was passed in.
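To make the Latin-1 versus UTF-8 distinction concrete, here is a small sketch in standard C++ that interprets the same bytes both ways. `decodeAsLatin1` and `decodeAsUTF8` are hypothetical helpers written for this example, not WebKit functions, and the UTF-8 decoder assumes well-formed input and performs no validation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Latin-1: every byte is exactly one code point with the same numeric value.
std::u32string decodeAsLatin1(const std::vector<std::uint8_t>& bytes)
{
    std::u32string out;
    for (std::uint8_t byte : bytes)
        out.push_back(byte);
    return out;
}

// UTF-8: a lead byte announces a sequence of 1 to 4 bytes.
// Simplified: assumes well-formed input, no validation of continuation bytes.
std::u32string decodeAsUTF8(const std::vector<std::uint8_t>& bytes)
{
    std::u32string out;
    for (std::size_t i = 0; i < bytes.size();) {
        std::uint8_t lead = bytes[i];
        std::size_t length = lead < 0x80 ? 1 : lead < 0xE0 ? 2 : lead < 0xF0 ? 3 : 4;
        assert(i + length <= bytes.size());
        // Keep only the payload bits of the lead byte (7, 5, 4 or 3 of them).
        char32_t codePoint = (length == 1) ? lead : (lead & (0xFF >> (length + 1)));
        // Each continuation byte contributes its low 6 bits.
        for (std::size_t j = 1; j < length; ++j)
            codePoint = (codePoint << 6) | (bytes[i + j] & 0x3F);
        out.push_back(codePoint);
        i += length;
    }
    return out;
}
```

For example, the escaped sequence %C3%A9 yields the bytes 0xC3 0xA9. Read as Latin-1 that is two characters (U+00C3, U+00A9); read as UTF-8 it is the single character U+00E9.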
> What makes this behavior change OK? Maybe I am wrong, but I don’t think it’s
> OK! I have no objections to making a fast path which treats sequences as
> Latin-1 since that’s what we store in 8-bit WTF::String, but that’s not the
> same thing as treating sequences as UTF-8. The old contract was that we
> would treat the sequences as UTF-8 when no encoding was passed in.

I haven't fully confirmed, but the idea is that it is not a behavior change because no one else is calling this with an invalid encoder.
> Why a pointer rather than a reference?

Null here indicates decode failure. Could use some other mechanism for that of course.
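As one possible "other mechanism", the failure case could be carried by an optional value instead of a null pointer. The sketch below is only an illustration in standard C++: `Result` and `decode` here are simplified, hypothetical versions that run synchronously and skip header parsing, so they do not reflect the patch's actual behavior.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <utility>

// Simplified, hypothetical result type for this sketch.
struct Result {
    std::string mimeType;
    std::string charset;
    std::string data;
};

// An empty optional signals decode failure, replacing the null pointer.
using DecodeCompletionHandler = std::function<void(std::optional<Result>)>;

// Toy decoder: synchronous, ignores the header, only shows the API shape.
void decode(const std::string& url, DecodeCompletionHandler completionHandler)
{
    assert(completionHandler);
    auto comma = url.find(',');
    if (url.compare(0, 5, "data:") != 0 || comma == std::string::npos) {
        completionHandler(std::nullopt);
        return;
    }
    Result result { "text/plain", "US-ASCII", url.substr(comma + 1) };
    completionHandler(std::move(result));
}
```

With this shape the handler receives the result by value, so it can move the decoded data out rather than copy it.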
> I haven't fully confirmed, but the idea is that it is not a behavior change
> because no one else is calling this with an invalid encoder.

We need to do something along these lines because data urls expect bytes to get through unmodified even when not using base64. This kind of stuff is supposed to work:

data:image/png,%89PNG%0D%0A%1A%0A%00%00%00%0D...
(In reply to comment #20)
> We need to do something along these lines because data urls expect bytes to
> get through unmodified even when not using base64. This kind of stuff is
> supposed to work:
>
> data:image/png,%89PNG%0D%0A%1A%0A%00%00%00%0D...

Yes, there’s no question that for data URLs we want to decode into bytes, not characters. I think it's a bit peculiar that we decode into a String rather than a SharedBuffer or something like that. My point was about the other callers of this API, not a doubt that this was correct for data URL handling.
Comment on attachment 259369 [details] patch Attachment 259369 [details] did not pass mac-ews (mac): Output: http://webkit-queues.webkit.org/results/75783 New failing tests: contentfiltering/block-after-response-then-deny-unblock.html contentfiltering/block-after-response.html contentfiltering/block-after-response-then-allow-unblock.html
Created attachment 259384 [details] Archive of layout-test-results from ews100 for mac-mavericks The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews100 Port: mac-mavericks Platform: Mac OS X 10.9.5
> Yes, there’s no question that for data URLs we want to decode into bytes,
> not characters. I think it's a bit peculiar that we decode into a String
> rather than a SharedBuffer or something like that.

Perhaps I should add a separate decodeDataURLEscapeSequences() that produces bytes directly. It will be very similar to the existing DecodeEscapeSequences stuff but maybe there is a way to minimize duplication.
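A byte-producing decoder of the kind proposed here could look roughly like the following sketch in standard C++. `percentDecodeToBytes` is a hypothetical name and this is not the function that ended up in the patch; it assumes that malformed escapes are passed through unchanged, which the thread does not specify.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Returns the value of an ASCII hex digit, or -1 if `c` is not one.
constexpr int hexDigitValue(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

// Decodes %XX escapes straight into bytes; no character encoding is applied.
// Assumption: a '%' not followed by two hex digits is copied through as-is.
std::vector<std::uint8_t> percentDecodeToBytes(std::string_view input)
{
    std::vector<std::uint8_t> bytes;
    bytes.reserve(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        if (input[i] == '%' && i + 2 < input.size()) {
            int high = hexDigitValue(input[i + 1]);
            int low = hexDigitValue(input[i + 2]);
            if (high >= 0 && low >= 0) {
                bytes.push_back(static_cast<std::uint8_t>(high * 16 + low));
                i += 2; // skip the two hex digits just consumed
                continue;
            }
        }
        bytes.push_back(static_cast<std::uint8_t>(input[i]));
    }
    // Decoding never grows the data: each escape shrinks three characters to one byte.
    assert(bytes.size() <= input.size());
    return bytes;
}
```

Applied to the start of the PNG example quoted above, `%89PNG%0D%0A%1A%0A` becomes the eight bytes 0x89 'P' 'N' 'G' 0x0D 0x0A 0x1A 0x0A, with the 0x89 byte preserved exactly.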
Comment on attachment 259369 [details] patch Attachment 259369 [details] did not pass mac-ews (mac): Output: http://webkit-queues.webkit.org/results/75713 New failing tests: contentfiltering/block-after-response-then-deny-unblock.html contentfiltering/block-after-response.html contentfiltering/block-after-response-then-allow-unblock.html
Created attachment 259389 [details] Archive of layout-test-results from ews101 for mac-mavericks The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews101 Port: mac-mavericks Platform: Mac OS X 10.9.5
Comment on attachment 259369 [details] patch Attachment 259369 [details] did not pass mac-ews (mac): Output: http://webkit-queues.webkit.org/results/76039 New failing tests: contentfiltering/block-after-response-then-deny-unblock.html contentfiltering/block-after-response.html contentfiltering/block-after-response-then-allow-unblock.html
Created attachment 259399 [details] Archive of layout-test-results from ews103 for mac-mavericks The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews103 Port: mac-mavericks Platform: Mac OS X 10.9.5
Comment on attachment 259369 [details] patch Attachment 259369 [details] did not pass mac-ews (mac): Output: http://webkit-queues.webkit.org/results/76277 New failing tests: contentfiltering/block-after-response-then-deny-unblock.html contentfiltering/block-after-response.html contentfiltering/block-after-response-then-allow-unblock.html
Created attachment 259413 [details] Archive of layout-test-results from ews102 for mac-mavericks The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews102 Port: mac-mavericks Platform: Mac OS X 10.9.5
Created attachment 259613 [details] patch
Attachment 259613 [details] did not pass style-queue: ERROR: Source/WebCore/platform/network/DataURLDecoder.h:47: Extra space before ( in function call [whitespace/parens] [4] Total errors found: 1 in 12 files If any of these errors are false positives, please file a bug against check-webkit-style.
Created attachment 259618 [details] patch
Attachment 259618 [details] did not pass style-queue: ERROR: Source/WebCore/platform/network/DataURLDecoder.cpp:84: Should have only a single space after a punctuation in a comment. [whitespace/comments] [5] ERROR: Source/WebCore/platform/network/DataURLDecoder.h:47: Extra space before ( in function call [whitespace/parens] [4] Total errors found: 2 in 12 files If any of these errors are false positives, please file a bug against check-webkit-style.
Comment on attachment 259618 [details] patch
View in context: https://bugs.webkit.org/attachment.cgi?id=259618&action=review

Looks good to me.

> Source/WebCore/loader/ResourceLoader.cpp:210
> +    if (url.protocolIsData()) {

I think the code inside this if statement is long enough that it could be clearer if factored into a separate private named member function, perhaps named loadDataURL or startLoadingDataURL or something.

> Source/WebCore/loader/ResourceLoader.cpp:222
> +            ResourceResponse dataResponse { url, result.mimeType, dataSize, result.charset };

If we had the right kind of constructor for ResourceResponse we could use WTF::move on all the arguments here to avoid a little bit of reference count churn.

> Source/WebCore/platform/network/DataURLDecoder.cpp:53
> +    DecodeTask(const String& urlString, StringView encodedData, bool isBase64, DecodeCompletionHandler completionHandler)
> +        : urlString(urlString)
> +        , encodedData(encodedData)
> +        , isBase64(isBase64)
> +        , completionHandler(completionHandler)
> +    { }

I don’t think we need this constructor. We should just be able to write:

    std::make_unique<DecodeTask>({ ... });

below without using a constructor.

> Source/WebCore/platform/network/DataURLDecoder.cpp:68
> +    ASSERT(urlString.startsWith(dataString));

This check needs to ignore ASCII case.

> Source/WebCore/platform/network/DataURLDecoder.cpp:72
> +    if (headerEnd == notFound)
> +        return nullptr;

Do we have a test case that covers this?

> Source/WebCore/platform/network/DataURLDecoder.cpp:87
> +        mimeType = "text/plain";

I think ASCIILiteral("text/plain") will be slightly more efficient.

> Source/WebCore/platform/network/DataURLDecoder.cpp:89
> +            charset = "US-ASCII";

I think ASCIILiteral("US-ASCII") will be slightly more efficient.

> Source/WebCore/platform/network/DataURLDecoder.cpp:92
> +    decodeTask->result.mimeType = mimeType;

Please consider using WTF::move here to get rid of a tiny bit of reference count churn.

> Source/WebCore/platform/network/DataURLDecoder.cpp:93
> +    decodeTask->result.charset = charset;

Please consider using WTF::move here to get rid of a tiny bit of reference count churn.

> Source/WebCore/platform/network/DataURLDecoder.cpp:104
> +    if (!base64URLDecode(task.encodedData.toStringWithoutCopying(), buffer)) {
> +        // Didn't work, try unescaping and decoding as base64.
> +        auto unescapedString = decodeURLEscapeSequences(task.encodedData.toStringWithoutCopying());

This toStringWithoutCopying thing seems a bit ugly, and could be avoided if the functions took StringView instead of const String& arguments.

> Source/WebCore/platform/network/DataURLDecoder.cpp:116
> +    auto buffer = decodeURLEscapeSequencesAsData(task.encodedData.toStringWithoutCopying(), encoding);

Same thought about toStringWithoutCopying.

> Source/WebCore/platform/network/DataURLDecoder.cpp:117
> +    task.result.data = SharedBuffer::create(buffer.data(), buffer.size());

Why not use SharedBuffer::adoptVector here?

Stray space here, after the "=".

> Source/WebCore/platform/network/DataURLDecoder.cpp:147
> +            decodeTask->completionHandler(decodeTask->result);

Please consider using WTF::move here to get rid of a tiny bit of reference count churn.

> Source/WebCore/platform/network/DataURLDecoder.h:49
> +void decode(const URL&, DecodeCompletionHandler);

Don’t functions sometimes have state and are thus expensive to copy? If so, it might be nicer to take ownership of the completion handler rather than passing it by value.

> Source/WebCore/platform/text/DecodeEscapeSequences.h:158
> +inline Vector<char> decodeURLEscapeSequencesAsData(const String& string, const TextEncoding& encoding)

I think it would be better to have this take a StringView instead of a String. Should be a simple matter of changing the URLEscapeSequence functions to work on StringView.

> Source/WebCore/platform/text/DecodeEscapeSequences.h:170
> +            encodedRunPosition = length;

I’m not sure this line of code is needed. We intentionally made encodedRunPosition a very large value so it doesn’t need to be explicitly checked for. Using StringView::substring with it should work because it clamps its arguments down to the length of the string. It’s even possible, depending on how URLEscapeSequence::findEndOfRun is written, that we could remove the if statement entirely. Maybe you don’t want to write code that relies on that. Maybe some day we’ll change the return values to be Optional<size_t> instead to make things like that more clear.

> Source/WebCore/platform/text/DecodeEscapeSequences.h:181
> +        auto stringFragment = StringView(string).substring(decodedPosition, encodedRunPosition - decodedPosition);
> +        auto encodedStringFragment = encoding.encode(StringView(string).substring(decodedPosition, encodedRunPosition - decodedPosition), URLEncodedEntitiesForUnencodables);

Looks like when refactoring this you forgot to use the stringFragment local. I suggest either getting rid of it or using it.

> Source/WebCore/platform/text/DecodeEscapeSequences.h:184
> +        if (encodedRunPosition == length)

If you make the change I suggested above this would be encodedRunPosition == notFound.

> Source/WebCore/platform/text/DecodeEscapeSequences.h:185
> +            return result;

Should this be doing some kind of “shrink to fit” thing?

> Source/WebCore/platform/text/DecodeEscapeSequences.h:195
> +    ASSERT_NOT_REACHED();
> +    return { };

I’m surprised this is needed. Don’t our compilers recognize the infinite loop?

> Source/WebKit2/WebProcess/Network/WebResourceLoadScheduler.cpp:155
> +        resourceLoader->start();
>          m_webResourceLoaders.set(identifier, WebResourceLoader::create(resourceLoader));

I know it’s only one or two lines of code, but I think a helper function might be nice for the m_webResourceLoaders thing that’s now repeated four times.
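The clamping argument about encodedRunPosition can be illustrated with std::string_view, used here purely as a stand-in for WTF's StringView: `substr` clamps its length argument to the characters that remain, so a "not found" position that is the largest size_t value needs no special case. `literalRunBefore` and this `notFound` constant are hypothetical names for the example. Unlike the clamping described in the review comment, `std::string_view::substr` throws if the start position is past the end, so the sketch asserts that precondition.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Stand-in for a "not found" sentinel: the largest std::size_t value.
constexpr std::size_t notFound = std::string_view::npos;

// Copies the unescaped text between `start` and the next '%', or up to the
// end of the input when there is no further '%'.
std::string literalRunBefore(std::string_view input, std::size_t start)
{
    assert(start <= input.size());
    std::size_t encodedRunPosition = input.find('%', start);
    // No explicit check for notFound: the computed length is then huge, and
    // substr() clamps it to the number of characters left after `start`.
    return std::string(input.substr(start, encodedRunPosition - start));
}
```

With "abc%20def" and a start of 0 this returns "abc"; with "abcdef" and a start of 2 there is no '%', the position is `notFound`, and the clamped result is "cdef".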
> I don’t think we need this constructor. We should just be able to write:
>
> std::make_unique<DecodeTask>({ ... });

That wouldn't compile. A more awkward version does:

std::make_unique<DecodeTask>(DecodeTask { ... });

> > Source/WebCore/platform/network/DataURLDecoder.cpp:68
> > +    ASSERT(urlString.startsWith(dataString));
>
> This check needs to ignore ASCII case.

I believe the protocol part of the URL is always converted to lowercase.

> > Source/WebCore/platform/network/DataURLDecoder.cpp:72
> > +    if (headerEnd == notFound)
> > +        return nullptr;
>
> Do we have a test case that covers this?

Probably not. We have some explicit data url test coverage and a ton of incidental coverage. It would be good to add a bunch of more systematic cases covering all the edges.

> > Source/WebCore/platform/network/DataURLDecoder.h:49
> > +void decode(const URL&, DecodeCompletionHandler);
>
> Don’t functions sometimes have state and are thus expensive to copy? If so,
> it might be nicer to take ownership of the completion handler rather than
> passing it by value.

std::function has an efficient move constructor. I think we just need to use WTF::move in a few places (instead of, say, DecodeCompletionHandler&&)?

> > Source/WebCore/platform/text/DecodeEscapeSequences.h:170
> > +            encodedRunPosition = length;
>
> I’m not sure this line of code is needed. We intentionally made
> encodedRunPosition a very large value so it doesn’t need to be explicitly
> checked for. Using StringView::substring with it should work because it
> clamps its arguments down to the length of the string. It’s even possible,
> depending on how URLEscapeSequence::findEndOfRun is written, that we could
> remove the if statement entirely.

Ok. I didn't know that the notFound value was intentionally chosen for this purpose.
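The two C++ points in this reply can be shown with standard library types. In the sketch below, `DecodeTask` is a simplified, hypothetical aggregate (no user-declared constructor) and `createDecodeTask` is an invented helper; std::move stands in for WTF::move. A bare braced list cannot be passed to std::make_unique because it has no type for the forwarding parameters to deduce, so the aggregate is named explicitly. The completion handler is taken by value and moved into place, which avoids copying any state it captures.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>

using DecodeCompletionHandler = std::function<void(const std::string&)>;

// Simplified, hypothetical aggregate: no user-declared constructor.
struct DecodeTask {
    std::string urlString;
    bool isBase64;
    DecodeCompletionHandler completionHandler;
};

std::unique_ptr<DecodeTask> createDecodeTask(std::string urlString, bool isBase64, DecodeCompletionHandler completionHandler)
{
    assert(completionHandler);
    // std::make_unique<DecodeTask>({ ... }) is rejected: the braced list has no
    // deducible type. Building a temporary DecodeTask and letting make_unique
    // move-construct from it does compile.
    return std::make_unique<DecodeTask>(DecodeTask { std::move(urlString), isBase64, std::move(completionHandler) });
}
```

Taking the handler by value lets callers pass either a temporary (moved all the way through) or an lvalue (copied once at the call site).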
Created attachment 259659 [details] patch
Attachment 259659 [details] did not pass style-queue: ERROR: Source/WebCore/platform/network/DataURLDecoder.h:47: Extra space before ( in function call [whitespace/parens] [4] Total errors found: 1 in 13 files If any of these errors are false positives, please file a bug against check-webkit-style.
Comment on attachment 259659 [details] patch Attachment 259659 [details] did not pass mac-ews (mac): Output: http://webkit-queues.webkit.org/results/87137 New failing tests: editing/execCommand/apply-style-command-crash.html fast/hidpi/image-srcset-data-srcset-invalid-inputs.html editing/style/apply-style-crash.html fast/css/url-with-multi-byte-unicode-escape.html
Created attachment 259663 [details] Archive of layout-test-results from ews103 for mac-mavericks The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews103 Port: mac-mavericks Platform: Mac OS X 10.9.5
Comment on attachment 259659 [details] patch Attachment 259659 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.webkit.org/results/87192 New failing tests: editing/execCommand/apply-style-command-crash.html fast/hidpi/image-srcset-data-srcset-invalid-inputs.html editing/style/apply-style-crash.html fast/css/url-with-multi-byte-unicode-escape.html
Created attachment 259668 [details] Archive of layout-test-results from ews104 for mac-mavericks-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews104 Port: mac-mavericks-wk2 Platform: Mac OS X 10.9.5
Created attachment 259715 [details] patch
Attachment 259715 [details] did not pass style-queue: ERROR: Source/WebCore/platform/network/DataURLDecoder.h:47: Extra space before ( in function call [whitespace/parens] [4] Total errors found: 1 in 13 files If any of these errors are false positives, please file a bug against check-webkit-style.
Comment on attachment 259715 [details] patch Clearing flags on attachment: 259715 Committed r188820: <http://trac.webkit.org/changeset/188820>
All reviewed patches have been landed. Closing bug.
This made at least one test flaky; please see bug 148533.