Bug 163096 - window.navigator.language incorrectly returns all lowercase string
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Bindings
Version: Safari Technology Preview
Hardware: Mac OS X 10.11
Importance: P2 Normal
Assignee: Chris Dumez
URL: https://html.spec.whatwg.org/#dom-nav...
Keywords:
Depends on:
Blocks: 163211
Reported: 2016-10-06 17:37 PDT by Matt Stow
Modified: 2016-10-10 07:32 PDT

Attachments
Patch (7.72 KB, patch)
2016-10-07 16:13 PDT, Chris Dumez

Description Matt Stow 2016-10-06 17:37:08 PDT
When calling window.navigator.language, Safari will return something like "en-us", whereas every other browser returns "en-US".

As per [MDN](https://developer.mozilla.org/en-US/docs/Web/API/NavigatorLanguage/language) and defined in [BCP47](http://www.ietf.org/rfc/bcp/bcp47.txt), valid Extended Language Subtags must be 2*3ALPHA (uppercase).

In the current web app I'm building, this caused havoc with our localization, as the "en-us" didn't match any languages we had localized strings for.
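The mismatch described above can be avoided on the application side by comparing tags without regard to case. The following is only a minimal sketch: the `translations` table and the `findTranslation` helper are hypothetical names, not part of the reporter's app.

```typescript
// Hypothetical table of localized strings, keyed by language tag.
const translations: Record<string, string> = {
  "en-US": "Hello",
  "fr-FR": "Bonjour",
};

// Look up a tag without depending on the casing the browser reports,
// so that both "en-us" and "en-US" resolve to the same entry.
function findTranslation(tag: string): string | undefined {
  const wanted = tag.toLowerCase();
  const key = Object.keys(translations).find(
    (candidate) => candidate.toLowerCase() === wanted
  );
  return key === undefined ? undefined : translations[key];
}
```

In a browser this would be called as `findTranslation(window.navigator.language)`.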
Comment 1 Chris Dumez 2016-10-07 10:58:20 PDT
Specification:
- https://html.spec.whatwg.org/#dom-navigator-language
- https://tools.ietf.org/html/bcp47

I confirmed that the part after the '-' is uppercase in Firefox and Chrome.
Comment 2 Chris Dumez 2016-10-07 11:35:59 PDT
In particular, see https://tools.ietf.org/html/bcp47#section-2.2.4 for the Region subtag, which points to ISO 3166-1 for country codes (all of which appear to be uppercase).
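For reference, the casing convention recommended by BCP 47 (RFC 5646, section 2.1.1) can be sketched as below: subtags are lowercase, except that two-letter subtags are uppercase and four-letter subtags are titlecase when they are neither the first subtag nor located after a singleton. The function name `formatTagCase` is hypothetical, and the sketch only adjusts case; it does not validate the tag.

```typescript
// Apply the BCP 47 casing convention to a language tag (case only, no validation).
function formatTagCase(tag: string): string {
  let seenSingleton = false;
  const out: string[] = [];
  tag.split("-").forEach((subtag, index) => {
    const lower = subtag.toLowerCase();
    if (index === 0 || seenSingleton) {
      // Primary language subtag, or anything after a singleton such as "x" or "u".
      out.push(lower);
    } else if (lower.length === 2) {
      // Two-letter subtag (region): uppercase.
      out.push(lower.toUpperCase());
    } else if (lower.length === 4) {
      // Four-letter subtag (script): titlecase.
      out.push(lower.charAt(0).toUpperCase() + lower.slice(1));
    } else {
      out.push(lower);
    }
    if (lower.length === 1) {
      seenSingleton = true;
    }
  });
  return out.join("-");
}
```

With this sketch, `formatTagCase("en-us")` gives `"en-US"`. Engines that support `Intl.getCanonicalLocales()` offer a built-in alternative that also validates the tag.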
Comment 3 Chris Dumez 2016-10-07 15:19:42 PDT
As per https://developer.apple.com/reference/corefoundation/1666963-cflocale?language=objc , CFLocale is using BCP-47 language tags. Our Mac implementation is relying on CFLocaleCopyPreferredLanguages() which does return BCP-47 language tags.

However, we then call httpStyleLanguageCode() on them which alters their format (and lower cases them).
Comment 4 Chris Dumez 2016-10-07 15:31:31 PDT
(In reply to comment #3)
> As per https://developer.apple.com/reference/corefoundation/1666963-cflocale?language=objc , CFLocale is using BCP-47 language tags. Our Mac implementation is relying on CFLocaleCopyPreferredLanguages() which does return BCP-47 language tags.
> 
> However, we then call httpStyleLanguageCode() on them which alters their format (and lower cases them).

The name of this function seems to indicate this formatting is used for HTTP. However, RFC 2616 says:
"""
3.10 Language Tags

A language tag identifies a natural language spoken, written, or otherwise conveyed by human beings for communication of information to other human beings. Computer languages are explicitly excluded. HTTP uses language tags within the Accept-Language and Content-Language fields.

The syntax and registry of HTTP language tags is the same as that defined by RFC 1766 [1]. In summary, a language tag is composed of 1 or more parts: A primary language tag and a possibly empty series of subtags:

        language-tag  = primary-tag *( "-" subtag )
        primary-tag   = 1*8ALPHA
        subtag        = 1*8ALPHA
White space is not allowed within the tag and all tags are case-insensitive. The name space of language tags is administered by the IANA. Example tags include:

       en, en-US, en-cockney, i-cherokee, x-pig-latin
where any two-letter primary-tag is an ISO-639 language abbreviation and any two-letter initial subtag is an ISO-3166 country code. (The last three tags above are not registered tags; all but the last are examples of tags which could be registered in future.)
"""

https://www.ietf.org/rfc/rfc1766.txt says that language tags are case insensitive so the fact that we no longer return lowercase would not break HTTP use-cases.
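To illustrate what case-insensitive handling means on the receiving end, here is a rough sketch of a server-side matcher for Accept-Language values. The names `parseAcceptLanguage` and `pickLanguage` and the `supported` list are hypothetical; the sketch only does exact tag matching, ignores the `*` wildcard and prefix fallback, and is not how any particular server works.

```typescript
// Parse an Accept-Language value into tags ordered by descending q-value.
function parseAcceptLanguage(header: string): string[] {
  return header
    .split(",")
    .map((part) => {
      const [tag = "", ...params] = part.trim().split(";");
      const qParam = params
        .map((p) => p.trim())
        .find((p) => p.toLowerCase().startsWith("q="));
      // A missing q parameter means the default weight of 1.
      const q = qParam === undefined ? 1 : Number.parseFloat(qParam.slice(2));
      return { tag: tag.trim(), q: Number.isNaN(q) ? 0 : q };
    })
    .filter((entry) => entry.tag !== "" && entry.q > 0)
    .sort((a, b) => b.q - a.q)
    .map((entry) => entry.tag);
}

// Return the first supported tag that matches a requested tag, ignoring case.
function pickLanguage(header: string, supported: string[]): string | undefined {
  for (const wanted of parseAcceptLanguage(header)) {
    const match = supported.find(
      (candidate) => candidate.toLowerCase() === wanted.toLowerCase()
    );
    if (match !== undefined) {
      return match;
    }
  }
  return undefined;
}
```

A matcher written this way treats `en-us` and `en-US` identically, so it would not be affected by the change in casing.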
Comment 5 Chris Dumez 2016-10-07 16:13:19 PDT
Created attachment 290982
Patch
Comment 6 Geoffrey Garen 2016-10-07 16:51:13 PDT
+CC ap, so he can tell us why we are wrong.
Comment 7 Chris Dumez 2016-10-07 16:51:43 PDT
(In reply to comment #6)
> +CC ap, so he can tell us why we are wrong.

:D
Comment 8 WebKit Commit Bot 2016-10-07 20:33:46 PDT
Comment on attachment 290982
Patch

Clearing flags on attachment: 290982

Committed r206949: <http://trac.webkit.org/changeset/206949>
Comment 9 WebKit Commit Bot 2016-10-07 20:33:52 PDT
All reviewed patches have been landed.  Closing bug.
Comment 10 Alexey Proskuryakov 2016-10-08 12:20:11 PDT
> https://www.ietf.org/rfc/rfc1766.txt says that language tags are case insensitive so the fact that we no longer return lowercase would not break HTTP use-cases.

HTTP servers break for any attempted change to Accept-Language, so one needs to guess which behavior breaks the fewest sites. Chrome and Firefox use uppercase country codes in HTTP too now, so please file a radar against CFNetwork to consider changing this.