Bug 262146 - Parse 'systemLanguage' as a comma separated list
Summary: Parse 'systemLanguage' as a comma separated list
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: SVG
Version: Safari Technology Preview
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords: BrowserCompat, GoodFirstBug, InRadar
Depends on:
Blocks:
 
Reported: 2023-09-26 15:10 PDT by Ahmad Saleem
Modified: 2023-10-03 15:11 PDT

See Also:


Description Ahmad Saleem 2023-09-26 15:10:25 PDT
Hi Team,

While going through another bug, I came across another failing test:

Test Case: https://jsfiddle.net/sq8mtdgc/show

^ Fails in Safari / WebKit ToT, while it passes in Chrome Canary 119 and Firefox Nightly 120.

Blink Commit: https://chromium.googlesource.com/chromium/src.git/+/be92af090cae7b5f69f8cc33cbe0a4e9c5f37e27

Just wanted to raise this so we can fix it as well.

Thanks!
Comment 1 Karl Dubost 2023-09-27 00:30:49 PDT
Hmm, where could it fail?


systemLanguage is parsed as a SVGStringList
https://searchfox.org/wubkat/rev/42dc4893aca2f5a0e36c4314c8aa0555ebce6c88/Source/WebCore/svg/SVGTests.h#38-51

    SVGStringList& systemLanguage() { return m_systemLanguage; }


And SVGStringList.parse
https://searchfox.org/wubkat/rev/42dc4893aca2f5a0e36c4314c8aa0555ebce6c88/Source/WebCore/svg/SVGStringList.h#47


And SVGStringList::parse(StringView data, UChar delimiter)
https://searchfox.org/wubkat/rev/42dc4893aca2f5a0e36c4314c8aa0555ebce6c88/Source/WebCore/svg/SVGStringList.cpp#34-64


But reset() calls parse(string, ' '), so the list is split only on spaces.
Is that the cause?
Comment 2 Karl Dubost 2023-09-27 00:50:53 PDT
data:text/html,<svg><text systemLanguage="en-US, zh-Hans,zh-Hant"></text></svg>

document.querySelector('text').systemLanguage
indeed returns an SVGStringList with

* en-US,
* zh-Hans,zh-Hant


Even with a simpler test:
data:text/html,<svg><text systemLanguage="en,fr,de"></text></svg>

it returns
* en,fr,de


That's bad indeed. It never gets it right:


"en,fr,de"   -> { 0: "en,fr,de" }
"en, ,de"    -> { 0: "en,", 1: ",de" }
"en,,de"     -> { 0: "en,,de" }
"en, fr, de" -> { 0: "en,", 1: "fr,", 2: "de" }

It splits on spaces and never removes the commas.
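
To make the observed splitting concrete, here is a minimal sketch that reproduces the results above by splitting on U+0020 only. `splitOnSpaces` is a hypothetical helper written for illustration; it is not the WebKit implementation of `SVGStringList::parse`. It assumes that empty pieces are dropped, which is what the listed outputs suggest.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not WebKit code: split on U+0020 only and drop
// empty pieces. Commas are treated as ordinary characters.
std::vector<std::string> splitOnSpaces(const std::string& input)
{
    std::vector<std::string> tokens;
    std::size_t pos = 0;
    while (pos < input.size()) {
        // Skip any run of spaces before the next token.
        while (pos < input.size() && input[pos] == ' ')
            ++pos;
        std::size_t start = pos;
        // A token runs until the next space or the end of the input.
        while (pos < input.size() && input[pos] != ' ')
            ++pos;
        if (pos > start)
            tokens.push_back(input.substr(start, pos - start));
    }
    return tokens;
}
```

With this sketch, "en, fr, de" yields "en,", "fr," and "de", and "en,fr,de" stays a single item, which matches what the DOM reports.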

What does the spec say?
https://www.w3.org/TR/SVG2/struct.html#ConditionalProcessingSystemLanguageAttribute


Name:    systemLanguage
Value:   set of comma-separated tokens [HTML]
Initial: (none)


defined in https://html.spec.whatwg.org/multipage/common-microsyntaxes.html#set-of-comma-separated-tokens


A set of comma-separated tokens is a string containing zero or more tokens each separated from the next by a single U+002C COMMA character (,), where tokens consist of any string of zero or more characters, neither beginning nor ending with ASCII whitespace, nor containing any U+002C COMMA characters (,), and optionally surrounded by ASCII whitespace.

For instance, the string " a ,b,,d d " consists of four tokens: "a", "b", the empty string, and "d d". Leading and trailing whitespace around each token doesn't count as part of the token, and the empty string can be a token.
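
As a rough illustration of the definition quoted above, the following sketch splits a string on commas and strips ASCII whitespace around each token. `splitOnCommas` and `isASCIIWhitespace` are hypothetical names chosen for this example, not existing WebKit functions. The loop structure, including the fact that a trailing comma does not add a final empty token, is an assumption modeled on my reading of the "split on commas" steps rather than something stated in this bug.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// ASCII whitespace as used by the HTML definition: tab, LF, FF, CR, space.
static bool isASCIIWhitespace(char c)
{
    return c == '\t' || c == '\n' || c == '\f' || c == '\r' || c == ' ';
}

// Hypothetical helper, not WebKit code: split on U+002C COMMA and strip
// leading and trailing ASCII whitespace from each token.
std::vector<std::string> splitOnCommas(const std::string& input)
{
    std::vector<std::string> tokens;
    std::size_t pos = 0;
    while (pos < input.size()) {
        // Collect everything up to the next comma or the end of the input.
        std::size_t start = pos;
        while (pos < input.size() && input[pos] != ',')
            ++pos;
        std::size_t end = pos;
        // Strip ASCII whitespace on both sides of the collected token.
        while (start < end && isASCIIWhitespace(input[start]))
            ++start;
        while (end > start && isASCIIWhitespace(input[end - 1]))
            --end;
        // Empty tokens are kept, as in the spec example.
        tokens.push_back(input.substr(start, end - start));
        // Step over the comma, if there is one.
        if (pos < input.size())
            ++pos;
    }
    return tokens;
}
```

Under these assumptions, " a ,b,,d d " gives the four tokens from the spec example, and "en-US, zh-Hans,zh-Hant" gives three language tags with no stray commas.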


The confusion probably dates from when the code was written, because
requiredExtensions is a **space**-separated attribute.
https://www.w3.org/TR/SVG2/struct.html#ConditionalProcessingRequiredExtensionsAttribute
Comment 3 Karl Dubost 2023-09-27 00:59:01 PDT
And this is the interface for SVGStringList
https://www.w3.org/TR/SVG2/types.html#InterfaceSVGStringList
Comment 4 Radar WebKit Bug Importer 2023-10-03 15:11:19 PDT
<rdar://problem/116427520>