Bug 166485 - CSS hyphens: auto should not work if lang="" is not declared
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: CSS
Version: Safari Technology Preview
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Nobody
Keywords: InRadar
Reported: 2016-12-26 15:13 PST by Simon Pieters (:zcorpan)
Modified: 2022-09-10 10:41 PDT
Description Simon Pieters (:zcorpan) 2016-12-26 15:13:01 PST
Equivalent Chromium bug: https://bugs.chromium.org/p/chromium/issues/detail?id=676270

WebKit hyphenates text with 'hyphens: auto' when no language is declared. Firefox does not.

MDN says:

> Hyphenation rules are language-specific. In HTML, the language is determined by the lang attribute, and browsers will hyphenate only if this attribute is present and if an appropriate hyphenation dictionary is available.

Spec says:

> Correct automatic hyphenation requires a hyphenation resource appropriate to the language of the text being broken. The UA is therefore only required to automatically hyphenate text for which the content language is known and for which it has an appropriate hyphenation resource.
>
> Authors should correctly tag their content’s language (e.g. using the HTML lang attribute) in order to obtain correct automatic hyphenation. UAs may refuse to automatically hyphenate untagged content regardless of the hyphens property value.

https://drafts.csswg.org/css-text-3/#valdef-hyphens-auto

Now the spec doesn't forbid it, but I think the intent is that UAs should not hyphenate untagged content.

I don't know whether WebKit uses the system language when none is declared, uses language-agnostic rules, or does something else (it does not seem to auto-detect English in my simple test). If hyphenation should be automatic, applying language detection seems more reliable than using the system language. But I think for now we should just disable it and tell web developers to specify lang="" correctly.


Test case/demo:
http://software.hixie.ch/utilities/js/live-dom-viewer/saved/4761

<!DOCTYPE html>
<style> div { border:solid; width:150px; -webkit-hyphens:auto; hyphens:auto; } </style>
No lang
<div>Long words like implementation, initialization, realization, and hyphenation.</div>
lang=en-US
<div lang=en-US>Long words like implementation, initialization, realization, and hyphenation.</div>
Comment 1 Alexey Proskuryakov 2016-12-28 10:09:59 PST
This makes sense in principle, as the system language doesn't necessarily match the content language. But there are quite a few features that default to the system language. Off the top of my head: default fonts and font fallback; quotes; spellchecker language; even default character encoding.

I think that preventing hyphenation when a language is not explicitly specified would be inconsistent and confusing.
Comment 2 Simon Pieters (:zcorpan) 2016-12-30 02:33:32 PST
I think those other defaults can also be problematic, especially for users who travel and use someone else's computer (or a public computer), or users who visit sites in different languages. I think there have also been some experiments to move away from using the system language for at least character encoding fallback in Gecko.

WebKit prevents hyphenation for lang="unknownasdfasdf", which is already inconsistent with the other features you mention.

Maybe we should take this discussion to the CSSWG?
Comment 3 Myles C. Maxfield 2017-01-05 13:08:25 PST
I agree with Alexey. Removing hyphenation which used to "work" would be viewed by our users as a regression.
Comment 4 Simon Pieters (:zcorpan) 2017-01-08 13:51:04 PST
Spec issue opened: https://github.com/w3c/csswg-drafts/issues/869
Comment 5 Simon Pieters (:zcorpan) 2018-09-16 00:44:09 PDT
The spec issue has been resolved to disallow hyphenation when no language is declared.

https://github.com/w3c/csswg-drafts/issues/869#issuecomment-394938653
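
For illustration only, the resolved behavior can be sketched as a small decision function. This is not WebKit's implementation: the name mayAutoHyphenate, the dictionaries parameter, and the prefix-fallback matching are all assumptions made for this sketch, and real language-tag matching is more involved.

```typescript
// Hypothetical helper (not a WebKit API): decides whether 'hyphens: auto'
// may hyphenate, given the declared content language of the text.
function mayAutoHyphenate(
  declaredLang: string | null,
  // Assumed input: lowercase language tags that have a hyphenation resource.
  dictionaries: ReadonlySet<string>,
): boolean {
  // No lang attribute, or lang="": the content language is unknown,
  // so automatic hyphenation is not applied.
  if (declaredLang === null || declaredLang.trim() === "") {
    return false;
  }
  // Try the full tag first, then progressively shorter prefixes
  // ("en-us" -> "en"). This is a simplification of real tag matching.
  let tag = declaredLang.trim().toLowerCase();
  while (tag !== "") {
    if (dictionaries.has(tag)) {
      return true;
    }
    const cut = tag.lastIndexOf("-");
    tag = cut === -1 ? "" : tag.slice(0, cut);
  }
  // A language was declared, but no matching resource exists.
  return false;
}
```

Under these assumptions, the "No lang" div from the test case and a div with lang="unknownasdfasdf" would both be left unhyphenated, while the lang=en-US div would be hyphenated only if an English resource is available.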
Comment 6 Radar WebKit Bug Importer 2022-07-14 17:02:53 PDT
<rdar://problem/97043617>
Comment 7 Brent Fulgham 2022-07-15 10:28:16 PDT
The Chrome bug says they fixed it, but Hixie's test case still shows hyphens with No Lang on Chrome 103.0.5060.114.

Firefox (I guess, correctly) does not hyphenate for No Lang.
Comment 8 Brent Fulgham 2022-07-15 10:29:36 PDT
We don't seem to do very well in wpt/css/css-text/hyphens, so pulling in as a bug.