RESOLVED FIXED262146
Parse 'systemLanguage' as a comma separated list
https://bugs.webkit.org/show_bug.cgi?id=262146
Summary Parse 'systemLanguage' as a comma separated list
Ahmad Saleem
Reported 2023-09-26 15:10:25 PDT
Hi Team, Going through another bug, I came across another failing test: Test Case: https://jsfiddle.net/sq8mtdgc/show ^ Fails in Safari / WebKit ToT while passes Chrome Canary 119 and Firefox Nightly 120. Blink Commit: https://chromium.googlesource.com/chromium/src.git/+/be92af090cae7b5f69f8cc33cbe0a4e9c5f37e27 Just wanted to raise, so we can fix it as well. Thanks!
Attachments
Karl Dubost
Comment 1 2023-09-27 00:30:49 PDT
Hmm, Where could it fail? systemLanguage is parsed as a SVGStringList https://searchfox.org/wubkat/rev/42dc4893aca2f5a0e36c4314c8aa0555ebce6c88/Source/WebCore/svg/SVGTests.h#38-51 SVGStringList& systemLanguage() { return m_systemLanguage; } And SVGStringList.parse https://searchfox.org/wubkat/rev/42dc4893aca2f5a0e36c4314c8aa0555ebce6c88/Source/WebCore/svg/SVGStringList.h#47 And SVGStringList::parse(StringView data, UChar delimiter) https://searchfox.org/wubkat/rev/42dc4893aca2f5a0e36c4314c8aa0555ebce6c88/Source/WebCore/svg/SVGStringList.cpp#34-64 but reset defines the parse(string, ' '); Only with space? is it because of this?
Karl Dubost
Comment 2 2023-09-27 00:50:53 PDT
data:text/html,<svg><text systemLanguage="en-US, zh-Hans,zh-Hant"></text></svg> document.querySelector('text').systemLanguage returns an SVGStringList with indeed * en-US, * zh-Hans,zh-Hant even with a simpler test data:text/html,<svg><text systemLanguage="en,fr,de"></text></svg> it returns * en,fr,de That's bad indeed. I never gets it right "en,fr,de" -> { 0: "en,fr,de" } "en, ,de" -> { 0: "en,", 1: ",de" } "en,,de" -> { 0: "en,,de" } "en,,de" -> { 0: "en,,de" } "en, fr, de" -> { 0: "en,", 1: "fr,", 3: "de" } It separates on space, and never removes the comma. What does the spec says? https://www.w3.org/TR/SVG2/struct.html#ConditionalProcessingSystemLanguageAttribute Name: systemLanguage Value: set of comma-separated tokens [HTML] Initial: (none) defined in https://html.spec.whatwg.org/multipage/common-microsyntaxes.html#set-of-comma-separated-tokens A set of comma-separated tokens is a string containing zero or more tokens each separated from the next by a single U+002C COMMA character (,), where tokens consist of any string of zero or more characters, neither beginning nor ending with ASCII whitespace, nor containing any U+002C COMMA characters (,), and optionally surrounded by ASCII whitespace. For instance, the string " a ,b,,d d " consists of four tokens: "a", "b", the empty string, and "d d". Leading and trailing whitespace around each token doesn't count as part of the token, and the empty string can be a token. The confusion comes probably from when it was written because requiredExtensions is a **space** separated attribute. https://www.w3.org/TR/SVG2/struct.html#ConditionalProcessingRequiredExtensionsAttribute
Karl Dubost
Comment 3 2023-09-27 00:59:01 PDT
And this is the interface for SVGStringList https://www.w3.org/TR/SVG2/types.html#InterfaceSVGStringList
Radar WebKit Bug Importer
Comment 4 2023-10-03 15:11:19 PDT
serakeri
Comment 5 2024-10-22 14:25:45 PDT
Karl Dubost
Comment 6 2026-04-08 06:42:28 PDT
This needs to be reworked a bit. There is basically not tests coverage. The previous tests are SVG 1.1 and some unrelated animation tests. SVGTests is the C++ class that implements the conditional processing attributes from the SVG spec: requiredExtensions and systemLanguage. https://w3c.github.io/svgwg/svg2-draft/struct.html#ConditionalProcessing https://searchfox.org/wubkat/source/Source/WebCore/svg/SVGTests.cpp Any SVG element that can be conditionally shown or hidden (like elements inside a <switch>) mixes in SVGTests. It does two things: 1. Parses the attribute values from markup into an SVGStringList (the parseAttribute() method) 2. Evaluates whether the element should render, by checking if the listed languages/extensions match the user's environment (the isValid() method) So when we write: <switch> <text systemLanguage="fr">Bonjour</text> <text systemLanguage="en">Hello</text> <text>Fallback</text> </switch> There are WPT tests for SVG switch but probably not enough. This could be for another bug. https://wpt.fyi/results/svg/animations https://wpt.fyi/results/svg/struct/reftests/use-switch.svg SVGTests::isValid() is what decides which <text> gets rendered — it reads the parsed tokens from systemLanguage and compares them against the browser's language setting. https://searchfox.org/wubkat/rev/9971986dcebbc9ad5713a058e7951288c455bbee/Source/WebCore/svg/SVGTests.cpp#91-103 ```cpp void SVGTests::parseAttribute(const QualifiedName& attributeName, const AtomString& value) { switch (attributeName.nodeName()) { case AttributeNames::requiredExtensionsAttr: protect(requiredExtensions())->reset(value); break; case AttributeNames::systemLanguageAttr: protect(systemLanguage())->reset(value); break; default: break; } } ``` This method is called by the SVG element when the HTML parser (or setAttribute() from JavaScript) encounters a requiredExtensions or systemLanguage attribute. The value parameter is the raw attribute string, e.g. "en, fr, de". Step by step: 1. attributeName.nodeName() identifies which attribute we're dealing with. The switch dispatches to the right handler. 2. systemLanguage() returns a reference to the SVGStringList object that stores the parsed tokens for this element's systemLanguage attribute. It's a list of strings, like a Vector<String>. 3. protect(...) wraps the reference in a Ref smart pointer to keep it alive during the call (SaferCPP pattern — the object could theoretically be destroyed mid-operation if something re-entrant happens). 4. ->reset(value) this is the key part. It clears the list and re-parses the raw string into tokens. Currently reset() calls parse(string, ' ') — splitting on spaces. That's the bug. For systemLanguage, it should split on commas. So the entire flow is: raw string "en, fr, de" → reset() → parse(string, ' ') → tokens ["en,", "fr,", "de"] (wrong). If we change to parse(string, ',') → tokens ["en", "fr", "de"] (correct). We need WPT tests.
Karl Dubost
Comment 7 2026-04-08 07:46:30 PDT
Karl Dubost
Comment 8 2026-04-08 08:08:51 PDT
This is fixing the parsing. The validation will be fixed in https://bugs.webkit.org/show_bug.cgi?id=262140
EWS
Comment 9 2026-04-17 03:04:25 PDT
Committed 311439@main (94ce6308f502): <https://commits.webkit.org/311439@main> Reviewed commits have been landed. Closing PR #62274 and removing active labels.
Karl Dubost
Comment 10 2026-04-17 08:03:21 PDT
Submitted web-platform-tests pull request: https://github.com/web-platform-tests/wpt/pull/59317
Note You need to log in before you can comment on or make changes to this bug.