Bug 229553

Summary: input "maxlength" attribute counts grapheme clusters rather than code units
Product: WebKit Reporter: Takao Baba <baba>
Component: Forms Assignee: Nobody <webkit-unassigned>
Status: RESOLVED DUPLICATE    
Severity: Normal CC: ap, cdumez, mmaxfield, wenson_hsieh
Priority: P2    
Version: Safari Technology Preview   
Hardware: Unspecified   
OS: macOS 11   
Attachments:
Description Flags
screenshot none

Description Takao Baba 2021-08-26 06:23:20 PDT
Created attachment 436503 [details]
screenshot

Steps to reproduce:
1. Open https://jsbin.com/pujuyizuze/1/edit?html,output
2. Enter "👨‍👨‍👦".
  * Note: This is a single grapheme cluster, but it consists of five Unicode code points (Man/ZWJ/Man/ZWJ/Boy). Its "length" is 8.
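
The numbers in the note can be checked with a short sketch. It is written in Python purely for illustration; the helper name utf16_length is an assumption of this sketch, not something taken from WebKit or from the test page.

```python
# The family emoji from step 2, spelled out by code point:
# MAN, ZERO WIDTH JOINER, MAN, ZERO WIDTH JOINER, BOY.
FAMILY = "\U0001F468\u200D\U0001F468\u200D\U0001F466"


def utf16_length(s: str) -> int:
    """Count UTF-16 code units, which is what JavaScript's String.length reports.

    utf16_length is a hypothetical helper name used only in this sketch.
    """
    # Every UTF-16 code unit occupies exactly two bytes.
    return len(s.encode("utf-16-le")) // 2


code_points = len(FAMILY)          # Python's len() counts code points
code_units = utf16_length(FAMILY)  # each emoji outside the BMP needs a surrogate pair

print(code_points, code_units)  # prints "5 8"
```

Each of the three emoji takes two code units and each ZWJ takes one, which gives 2 + 1 + 2 + 1 + 2 = 8.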

Expected behavior:
Just "👨‍" (Man followed by ZWJ) is pasted; after that, no more characters can be added.

Actual behavior:
The entire "👨‍👨‍👦" is pasted. Furthermore, a total of three "👨‍👨‍👦" characters can be entered.

As the spec describes, "maxlength" must be checked against the "length" of the value, in other words the number of 16-bit code units. Counting grapheme clusters is incorrect.

https://html.spec.whatwg.org/multipage/form-control-infrastructure.html#attr-fe-maxlength
> The "number of characters" is measured using length

https://infra.spec.whatwg.org/#string-length
> A string’s length is the number of code units it contains.
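
A rough sketch of what a code-unit limit would do to the pasted text follows. It assumes the test page uses maxlength="3", which is consistent with the expected result but is not stated in this report. The function names are hypothetical, and the choice to drop a code point that does not fit entirely (rather than split a surrogate pair) is an assumption of the sketch, not something the spec text quoted here prescribes.

```python
FAMILY = "\U0001F468\u200D\U0001F468\u200D\U0001F466"  # Man/ZWJ/Man/ZWJ/Boy
MAXLENGTH = 3  # assumed value; the report does not say what the test page uses


def utf16_units(s: str) -> int:
    """Count UTF-16 code units (hypothetical helper for this sketch)."""
    return sum(2 if ord(ch) > 0xFFFF else 1 for ch in s)


def truncate_by_code_units(s: str, limit: int) -> str:
    """Keep leading code points while their UTF-16 code units fit within limit."""
    kept, used = [], 0
    for ch in s:
        units = 2 if ord(ch) > 0xFFFF else 1
        if used + units > limit:
            break  # assumption: never split a surrogate pair
        kept.append(ch)
        used += units
    return "".join(kept)


pasted = truncate_by_code_units(FAMILY, MAXLENGTH)
print([f"U+{ord(c):04X}" for c in pasted])  # prints ['U+1F468', 'U+200D']

# Three whole family emoji, as accepted in the "Actual behavior" above,
# add up to far more code units than the assumed limit.
print(utf16_units(FAMILY * 3))  # prints 24
```

Python's standard library has no grapheme cluster segmenter, so the grapheme-based count is not reproduced here; the 24 is simply the code-unit length of the three clusters that the report says could be entered.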

Other browsers such as Chrome, Firefox, and Edge behave correctly.
Comment 1 Alexey Proskuryakov 2021-08-26 17:11:31 PDT
This is intentional behavior, and changing it would be user hostile. Perhaps we need to follow up on standard changes mentioned in bug 120030.

*** This bug has been marked as a duplicate of bug 120030 ***