WebKit Bugzilla
RESOLVED INVALID
Bug 113001
Some Hebrew diacritics get messed up on form submission
https://bugs.webkit.org/show_bug.cgi?id=113001
Konstantin
Reported
2013-03-21 21:40:48 PDT
Created attachment 194439: Source of PHP script to reproduce the problem

When I submit any form with a text field containing the Hebrew diacritics U+05BC ("dagesh") and U+05B6 ("segol"), in this order, they are submitted to the server in the *opposite* order: U+05B6, U+05BC. Although the Hebrew word looks the same visually, this "fixed" order is invalid (or at least non-standard), and in any case the browser should not change data entered into a form on its own under any circumstances.

To demonstrate the issue, I wrote a simple PHP script (attached, and available online at http://zapad.org/~ignatiev/temp/w4.php) that lets the user fill in a text field and then, upon form submission, compares the user's input with what was actually submitted (using a simple JavaScript hash-sum implementation). You can play with it and see that it works fine for almost any text in any language you can enter. If, however, you use the "initialize" button, the script sets the text field to the string '\u05d1\u05bc\u05b5' (bet-dagesh-segol), and upon form submission the comparison test FAILS: the value submitted is '\u05d1\u05b5\u05bc' (bet-segol-dagesh).

The problem is reproducible in every WebKit-based browser I tried (Chrome on Windows/Mac, Safari on Mac/Windows/iPhone, the Debian 6 "Web browser", and the latest nightly build compiled from source on Linux/GTK), while it works fine in IE, Firefox, and (Presto-based) Opera.
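For readers without access to the attached script, the mismatch described above can be made visible by dumping both strings as code point escapes. This is only a sketch: `code_points` and `first_difference` are hypothetical helper names, not part of the attached PHP script, and the two string literals are copied from the report rather than captured from a real form submission.

```python
def code_points(text: str) -> str:
    """Render each code point as a \\uXXXX escape (at least four hex digits)."""
    return "".join(f"\\u{ord(ch):04x}" for ch in text)


def first_difference(a: str, b: str):
    """Return the index of the first differing code point, or None if identical."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # Same prefix: the strings differ only if one is longer than the other.
    return None if len(a) == len(b) else min(len(a), len(b))


# Values taken from the report, not from a live submission.
entered = "\u05d1\u05bc\u05b5"    # what the "initialize" button puts in the field
submitted = "\u05d1\u05b5\u05bc"  # what the report says reaches the server

print(code_points(entered))
print(code_points(submitted))
print("identical:", entered == submitted)
print("first differing index:", first_difference(entered, submitted))
```

The two strings have the same length and the same set of code points; only the order of the two combining marks after the base letter differs, which is why a byte-level comparison or hash check fails even though the rendered text looks unchanged.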
Attachments
Source of PHP script to reproduce the problem (1.64 KB, text/html), 2013-03-21 21:40 PDT, Konstantin, no flags
Alexey Proskuryakov
Comment 1
2013-03-26 11:58:08 PDT
> this "fixed" order is invalid (or at least non-standard)
In fact, '\u05d1\u05bc\u05b5' is not properly normalized: both the NFC and NFD forms of this string are '\u05d1\u05b5\u05bc'. Please see <http://unicode.org/reports/tr15/> for a discussion of Unicode normalization forms.

Overall, this is expected behavior. The reason we normalize to NFC when sending form text is compatibility: since Windows uses NFC everywhere, there can be subtle errors when text sent from WebKit is processed by systems that don't handle decomposed text well. I can see how WebKit becomes an outlier in this specific case, but this is the cost of being like other browsers in more common cases.
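The normalization claim in this comment can be checked with Python's standard `unicodedata` module. The sketch below is independent of WebKit's own code path; `canonically_equal` is a hypothetical helper showing one way a server could compare submitted text without being sensitive to mark order.

```python
import unicodedata

reported = "\u05d1\u05bc\u05b5"    # order described in the bug report
normalized = "\u05d1\u05b5\u05bc"  # order the comment says NFC and NFD produce

nfc = unicodedata.normalize("NFC", reported)
nfd = unicodedata.normalize("NFD", reported)

# Canonical combining classes drive the reordering: within a run of
# combining marks, marks are sorted by ascending class.
classes = {f"U+{ord(ch):04X}": unicodedata.combining(ch) for ch in reported}


def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print("NFC matches comment:", nfc == normalized)
print("NFD matches comment:", nfd == normalized)
print("combining classes:", classes)
print("canonically equal:", canonically_equal(reported, normalized))
```

U+05BC (dagesh) has combining class 21, while U+05B5 has class 15, so canonical ordering moves the vowel point ahead of the dagesh. The same holds for U+05B6, whose class is 16.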
Alexey Proskuryakov
Comment 2
2013-07-31 09:46:15 PDT
*** Bug 119320 has been marked as a duplicate of this bug. ***