RESOLVED FIXED180539
ApplicationManifestParser should strip whitespace from the raw input
https://bugs.webkit.org/show_bug.cgi?id=180539
Summary ApplicationManifestParser should strip whitespace from the raw input
David Quesada
Reported 2017-12-07 11:45:42 PST
Parsing manifests with additional whitespace seemed to work in earlier versions of my patches that used JavaScriptCore for parsing the manifest. The current method of parsing JSON doesn't like this, and many manifests from the web that were working before no longer parse.
Attachments
Patch (4.94 KB, patch)
2017-12-07 14:04 PST, David Quesada
joepeck: review+
Patch v2 (6.12 KB, patch)
2017-12-08 09:14 PST, David Quesada
joepeck: review+
Patch for landing (6.41 KB, patch)
2017-12-08 11:12 PST, David Quesada
no flags
Archive of layout-test-results from ews104 for mac-elcapitan-wk2 (2.56 MB, application/zip)
2017-12-08 12:59 PST, EWS Watchlist
no flags
Radar WebKit Bug Importer
Comment 1 2017-12-07 11:50:31 PST
David Quesada
Comment 2 2017-12-07 14:04:28 PST
Joseph Pecoraro
Comment 3 2017-12-07 16:41:49 PST
Comment on attachment 328735 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=328735&action=review r=me > Source/WebCore/Modules/applicationmanifest/ApplicationManifestParser.cpp:60 > RefPtr<JSON::Value> jsonValue; > - if (!JSON::Value::parseJSON(text, jsonValue)) { > + if (!JSON::Value::parseJSON(text.stripWhiteSpace(), jsonValue)) { Does this actually change behavior? parseJSON looks to ignore leading whitespace: > Token parseToken(const UChar* start, const UChar* end, const UChar** tokenStart, const UChar** tokenEnd) > { > while (start < end && isSpaceOrNewline(*start)) > ++start; > ... It might also be good to have a link to the spec at the top of this file somewhere for quick reference. I suspect that is: https://w3c.github.io/manifest/ This part appears to be: https://w3c.github.io/manifest/#dfn-obtaining-the-manifest 1. Let json be the result of parsing text. If parsing throws an error: ... And here parseJSON(text) should satisfy this without the stripWhiteSpace.
David Quesada
Comment 4 2017-12-07 17:00:57 PST
(In reply to Joseph Pecoraro from comment #3) > Comment on attachment 328735 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=328735&action=review > > r=me > > > Source/WebCore/Modules/applicationmanifest/ApplicationManifestParser.cpp:60 > > RefPtr<JSON::Value> jsonValue; > > - if (!JSON::Value::parseJSON(text, jsonValue)) { > > + if (!JSON::Value::parseJSON(text.stripWhiteSpace(), jsonValue)) { > > Does this actually change behavior? parseJSON looks to ignore leading > whitespace: > > > Token parseToken(const UChar* start, const UChar* end, const UChar** tokenStart, const UChar** tokenEnd) > > { > > while (start < end && isSpaceOrNewline(*start)) > > ++start; > > ... > > It might also be good to have a link to the spec at the top of this file > somewhere for quick reference. I suspect that is: > https://w3c.github.io/manifest/ > > This part appears to be: > https://w3c.github.io/manifest/#dfn-obtaining-the-manifest > > 1. Let json be the result of parsing text. If parsing throws an error: ... > > And here parseJSON(text) should satisfy this without the stripWhiteSpace. The issue is with trailing whitespace. JSON::Value::parseJSON() (JSONValues.cpp:531) appears to fail if there are any characters remaining after parsing the value: const UChar* start = characters; const UChar* end = start + jsonInput.length(); const UChar* tokenEnd; auto result = buildValue(start, end, &tokenEnd, 0); if (!result || tokenEnd != end) return false; where actual JSON parsing seems to handle it just fine: > JSON.parse("\t[1,2,3] \n") < [1, 2, 3] Also the fact that some manifests that apparently have trailing whitespace (e.g. https://www.progressivewebflap.com) parsed prior to switching to JSON::Value::parseJSON(), but not after, (but did again after this patch), means this *is* a change in behavior.
Joseph Pecoraro
Comment 5 2017-12-07 17:07:24 PST
> > 1. Let json be the result of parsing text. If parsing throws an error: ... > > > > And here parseJSON(text) should satisfy this without the stripWhiteSpace. > > The issue is with trailing whitespace. JSON::Value::parseJSON() > (JSONValues.cpp:531) appears to fail if there are any characters remaining > after parsing the value: Sounds like we should fix JSON::Value::parseJSON then.
David Quesada
Comment 6 2017-12-08 09:14:03 PST
Created attachment 328824 [details] Patch v2 Fix JSON::Value::parseJSON() instead of stripping its input in ApplicationManifestParser.
Joseph Pecoraro
Comment 7 2017-12-08 10:57:50 PST
Comment on attachment 328824 [details] Patch v2 View in context: https://bugs.webkit.org/attachment.cgi?id=328824&action=review > Tools/TestWebKitAPI/Tests/WTF/JSONValue.cpp:645 > + } I'd suggest another section to test leading/trailing whitespace permutations: { RefPtr<JSON::Value> value; EXPECT_TRUE(JSON::Value::parseJSON(" 1", value)); EXPECT_TRUE(JSON::Value::parseJSON("\t1", value)); EXPECT_TRUE(JSON::Value::parseJSON("\n1", value)); EXPECT_TRUE(JSON::Value::parseJSON("1 ", value)); EXPECT_TRUE(JSON::Value::parseJSON("1\t", value)); EXPECT_TRUE(JSON::Value::parseJSON("1\n", value)); EXPECT_TRUE(JSON::Value::parseJSON(" 1 ", value)); EXPECT_TRUE(JSON::Value::parseJSON(" {} ", value)); EXPECT_TRUE(JSON::Value::parseJSON(" [] ", value)); EXPECT_FALSE(JSON::Value::parseJSON("1 1", value)); EXPECT_FALSE(JSON::Value::parseJSON("{} {}", value)); EXPECT_FALSE(JSON::Value::parseJSON("[] []", value)); }
David Quesada
Comment 8 2017-12-08 11:12:49 PST
Created attachment 328837 [details] Patch for landing
EWS Watchlist
Comment 9 2017-12-08 12:59:40 PST Comment hidden (obsolete)
EWS Watchlist
Comment 10 2017-12-08 12:59:42 PST Comment hidden (obsolete)
Joseph Pecoraro
Comment 11 2017-12-08 13:44:38 PST
Comment on attachment 328837 [details] Patch for landing cq+ again, failure is unrelated.
WebKit Commit Bot
Comment 12 2017-12-08 14:05:11 PST
Comment on attachment 328837 [details] Patch for landing Clearing flags on attachment: 328837 Committed r225699: <https://trac.webkit.org/changeset/225699>
WebKit Commit Bot
Comment 13 2017-12-08 14:05:12 PST
All reviewed patches have been landed. Closing bug.
Alexey Proskuryakov
Comment 14 2017-12-08 16:49:57 PST
Comment on attachment 328837 [details] Patch for landing View in context: https://bugs.webkit.org/attachment.cgi?id=328837&action=review > Source/WTF/wtf/JSONValues.cpp:545 > - if (!result || tokenEnd != end) > + if (!result) > return false; > > + for (const UChar* valueEnd = tokenEnd; valueEnd < end; ++valueEnd) { > + if (!isSpaceOrNewline(*valueEnd)) > + return false; > + } Where is the ChangeLog for these changes? Also, a change like this needs a layout test validated against the spec and other browsers, not an API test. > Source/WebCore/ChangeLog:10 > + * Modules/applicationmanifest/ApplicationManifestParser.cpp: > + (WebCore::ApplicationManifestParser::parseManifest): This change is not in the patch, and was not committed.
Alexey Proskuryakov
Comment 15 2017-12-08 16:51:00 PST
Please explain why this JSON parsing change is correct, and add a test. I don't think that we should keep this patch in the tree without that.
Joseph Pecoraro
Comment 16 2017-12-08 18:06:07 PST
(In reply to Alexey Proskuryakov from comment #15) > Please explain why this JSON parsing change is correct, and add a test. I > don't think that we should keep this patch in the tree without that. This is a change to WTF::JSON::Value::parseJSON, used internally to WebKit and not directly exposed to the page (JSC::JSONParse). WTF::JSON::Value::parseJSON allowed leading whitespace but not trailing whitespace, and this change fixed that to allow trailing whitespace. This better aligns WTF::JSON's parse with the JavaScript `JSON.parse(" 1 ")` in webpages as well as JSON parsing in other environments (Ruby and Python in a quick test). As this is not directly exposed to web pages, TestWebKitAPI unit tests were written to test this directly. Additionally a specific Web App Manifest test was added where WTF::JSON::Value is used to parse a JSON manifest specified by the page that previously would have failed. I missed that the ChangeLogs were not updated between the first and second version of the patch in my review. Since it was addressing my review comments I overlooked ChangeLogs. I don't think this needs to be rolled out. Was there a specific kind of test you were looking for?
Alexey Proskuryakov
Comment 17 2017-12-08 19:36:15 PST
I was assuming that we were not crazy enough to have two JSON parsers! If this is not exposed to JS, then no other test is necessary.
Joseph Pecoraro
Comment 18 2017-12-11 10:32:31 PST
(In reply to Alexey Proskuryakov from comment #17) > I was assuming that we were not crazy enough to have two JSON parsers! Heh. The JavaScriptCore JSC::JSONParse parses and immediately produces JSValues tracked by the GC which is hard to work with if you want to interact with values in the JSON. WTF::JSON::Value::parseJSON parses and produces WTF::JSON::Values so it is basically a dictionary type easy to query, build, and stringify.
Note You need to log in before you can comment on or make changes to this bug.