Bug 262146
| Summary: | Parse 'systemLanguage' as a comma separated list | ||
|---|---|---|---|
| Product: | WebKit | Reporter: | Ahmad Saleem <ahmad.saleem792> |
| Component: | SVG | Assignee: | Karl Dubost <karlcow> |
| Status: | RESOLVED FIXED | ||
| Severity: | Normal | CC: | karlcow, sabouhallawa, webkit-bug-importer, zimmermann |
| Priority: | P2 | Keywords: | BrowserCompat, GoodFirstBug, InRadar |
| Version: | Safari Technology Preview | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| See Also: |
https://bugs.webkit.org/show_bug.cgi?id=262140 https://github.com/web-platform-tests/wpt/pull/59317 |
||
Ahmad Saleem
Hi Team,
Going through another bug, I came across another failing test:
Test Case: https://jsfiddle.net/sq8mtdgc/show
^ Fails in Safari / WebKit ToT while passes Chrome Canary 119 and Firefox Nightly 120.
Blink Commit: https://chromium.googlesource.com/chromium/src.git/+/be92af090cae7b5f69f8cc33cbe0a4e9c5f37e27
Just wanted to raise, so we can fix it as well.
Thanks!
| Attachments | ||
|---|---|---|
| Add attachment proposed patch, testcase, etc. |
Karl Dubost
Hmm, Where could it fail?
systemLanguage is parsed as a SVGStringList
https://searchfox.org/wubkat/rev/42dc4893aca2f5a0e36c4314c8aa0555ebce6c88/Source/WebCore/svg/SVGTests.h#38-51
SVGStringList& systemLanguage() { return m_systemLanguage; }
And SVGStringList.parse
https://searchfox.org/wubkat/rev/42dc4893aca2f5a0e36c4314c8aa0555ebce6c88/Source/WebCore/svg/SVGStringList.h#47
And SVGStringList::parse(StringView data, UChar delimiter)
https://searchfox.org/wubkat/rev/42dc4893aca2f5a0e36c4314c8aa0555ebce6c88/Source/WebCore/svg/SVGStringList.cpp#34-64
but reset defines the parse(string, ' ');
Only with space? is it because of this?
Karl Dubost
data:text/html,<svg><text systemLanguage="en-US, zh-Hans,zh-Hant"></text></svg>
document.querySelector('text').systemLanguage
returns an SVGStringList with indeed
* en-US,
* zh-Hans,zh-Hant
even with a simpler test
data:text/html,<svg><text systemLanguage="en,fr,de"></text></svg>
it returns
* en,fr,de
That's bad indeed. I never gets it right
"en,fr,de" -> { 0: "en,fr,de" }
"en, ,de" -> { 0: "en,", 1: ",de" }
"en,,de" -> { 0: "en,,de" }
"en,,de" -> { 0: "en,,de" }
"en, fr, de" -> { 0: "en,", 1: "fr,", 3: "de" }
It separates on space, and never removes the comma.
What does the spec says?
https://www.w3.org/TR/SVG2/struct.html#ConditionalProcessingSystemLanguageAttribute
Name: systemLanguage
Value: set of comma-separated tokens [HTML]
Initial: (none)
defined in https://html.spec.whatwg.org/multipage/common-microsyntaxes.html#set-of-comma-separated-tokens
A set of comma-separated tokens is a string containing zero or more tokens each separated from the next by a single U+002C COMMA character (,), where tokens consist of any string of zero or more characters, neither beginning nor ending with ASCII whitespace, nor containing any U+002C COMMA characters (,), and optionally surrounded by ASCII whitespace.
For instance, the string " a ,b,,d d " consists of four tokens: "a", "b", the empty string, and "d d". Leading and trailing whitespace around each token doesn't count as part of the token, and the empty string can be a token.
The confusion comes probably from when it was written because
requiredExtensions is a **space** separated attribute.
https://www.w3.org/TR/SVG2/struct.html#ConditionalProcessingRequiredExtensionsAttribute
Karl Dubost
And this is the interface for SVGStringList
https://www.w3.org/TR/SVG2/types.html#InterfaceSVGStringList
Radar WebKit Bug Importer
<rdar://problem/116427520>
serakeri
Pull request: https://github.com/WebKit/WebKit/pull/35582
Karl Dubost
This needs to be reworked a bit. There is basically not tests coverage. The previous tests are SVG 1.1 and some unrelated animation tests.
SVGTests is the C++ class that implements the conditional processing attributes from the SVG spec: requiredExtensions and systemLanguage.
https://w3c.github.io/svgwg/svg2-draft/struct.html#ConditionalProcessing
https://searchfox.org/wubkat/source/Source/WebCore/svg/SVGTests.cpp
Any SVG element that can be conditionally shown or hidden (like elements inside a <switch>) mixes in SVGTests. It does two things:
1. Parses the attribute values from markup into an SVGStringList (the parseAttribute() method)
2. Evaluates whether the element should render, by checking if the listed languages/extensions match the user's environment (the isValid() method)
So when we write:
<switch>
<text systemLanguage="fr">Bonjour</text>
<text systemLanguage="en">Hello</text>
<text>Fallback</text>
</switch>
There are WPT tests for SVG switch but probably not enough. This could be for another bug.
https://wpt.fyi/results/svg/animations
https://wpt.fyi/results/svg/struct/reftests/use-switch.svg
SVGTests::isValid() is what decides which <text> gets rendered — it reads the parsed tokens from systemLanguage and compares them against the browser's language setting.
https://searchfox.org/wubkat/rev/9971986dcebbc9ad5713a058e7951288c455bbee/Source/WebCore/svg/SVGTests.cpp#91-103
```cpp
void SVGTests::parseAttribute(const QualifiedName& attributeName, const AtomString& value)
{
switch (attributeName.nodeName()) {
case AttributeNames::requiredExtensionsAttr:
protect(requiredExtensions())->reset(value);
break;
case AttributeNames::systemLanguageAttr:
protect(systemLanguage())->reset(value);
break;
default:
break;
}
}
```
This method is called by the SVG element when the HTML parser (or setAttribute() from JavaScript) encounters a requiredExtensions or systemLanguage attribute. The value parameter is the raw attribute string, e.g. "en, fr, de".
Step by step:
1. attributeName.nodeName()
identifies which attribute we're dealing with. The switch dispatches to the right handler.
2. systemLanguage()
returns a reference to the SVGStringList object that stores the parsed tokens for this element's systemLanguage attribute. It's a list of strings, like a Vector<String>.
3. protect(...)
wraps the reference in a Ref smart pointer to keep it alive during the call (SaferCPP pattern — the object could theoretically be destroyed mid-operation if something re-entrant happens).
4. ->reset(value)
this is the key part. It clears the list and re-parses the raw string into tokens. Currently reset() calls parse(string, ' ') — splitting on spaces. That's the bug. For systemLanguage, it should split on commas.
So the entire flow is:
raw string "en, fr, de"
→ reset()
→ parse(string, ' ')
→ tokens ["en,", "fr,", "de"] (wrong).
If we change to parse(string, ',')
→ tokens ["en", "fr", "de"] (correct).
We need WPT tests.
Karl Dubost
Pull request: https://github.com/WebKit/WebKit/pull/62274
Karl Dubost
This is fixing the parsing.
The validation will be fixed in https://bugs.webkit.org/show_bug.cgi?id=262140
EWS
Committed 311439@main (94ce6308f502): <https://commits.webkit.org/311439@main>
Reviewed commits have been landed. Closing PR #62274 and removing active labels.
Karl Dubost
Submitted web-platform-tests pull request: https://github.com/web-platform-tests/wpt/pull/59317