<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>113001</bug_id>
          
          <creation_ts>2013-03-21 21:40:48 -0700</creation_ts>
          <short_desc>Some Hebrew diacritics get messed up on form submission</short_desc>
          <delta_ts>2013-07-31 09:46:15 -0700</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>Forms</component>
          <version>528+ (Nightly build)</version>
          <rep_platform>All</rep_platform>
          <op_sys>All</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>INVALID</resolution>
          
          
          <bug_file_loc>http://zapad.org/~ignatiev/temp/w4.php</bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P3</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>0</everconfirmed>
          <reporter name="Konstantin">ikn</reporter>
          <assigned_to name="Nobody">webkit-unassigned</assigned_to>
          <cc>ap</cc>
    
          <cc>rniwa</cc>
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>861082</commentid>
    <comment_count>0</comment_count>
      <attachid>194439</attachid>
    <who name="Konstantin">ikn</who>
    <bug_when>2013-03-21 21:40:48 -0700</bug_when>
    <thetext>Created attachment 194439
Source of PHP script to reproduce the problem

When I submit a form whose text field contains the Hebrew diacritics U+05BC (&quot;dagesh&quot;) and U+05B6 (&quot;segol&quot;), in this order, they are sent to the server in the *opposite* order: U+05B6, U+05BC. While the Hebrew word looks the &quot;same&quot; visually, this &quot;fixed&quot; order is invalid (or at least non-standard), and regardless, a browser obviously shouldn&apos;t change data entered into a form on its own, under any circumstances.

To demonstrate this issue, I wrote a simple PHP script (attached, and available online at http://zapad.org/~ignatiev/temp/w4.php). It lets the user fill in a text field and, upon form submission, compares the user input with what was actually submitted (via a simple JavaScript hash function). You can play with it and see that it works fine for almost any text you enter, in any language.

If, however, you use the &quot;initialize&quot; button, the script sets the text field to the string &apos;\u05d1\u05bc\u05b5&apos; (bet-dagesh-tsere; U+05B5 is tsere rather than segol, but it is reordered in the same way), and upon form submission the comparison test will FAIL; the value submitted will be &apos;\u05d1\u05b5\u05bc&apos; (bet-tsere-dagesh).
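
To make the difference concrete, here is a minimal Python sketch (not part of the attached script) that percent-encodes both values, assuming the field is sent as UTF-8 in a GET query string:

```python
from urllib.parse import quote

# Value set by the "initialize" button, and the value the server receives.
typed = "\u05d1\u05bc\u05b5"
received = "\u05d1\u05b5\u05bc"

# Percent-encode as UTF-8, the way the value appears in a GET query string.
print(quote(typed))     # %D7%91%D6%BC%D6%B5
print(quote(received))  # %D7%91%D6%B5%D6%BC
```

The last two code points arrive swapped, which is why the hash comparison fails.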

This problem is reproducible in every WebKit-based browser I tried (Chrome on Windows/Mac, Safari on Mac/Windows/iPhone, the Debian 6 &quot;Web browser&quot;, and the latest &quot;nightly build&quot;, compiled from source on Linux/GTK), while it works fine in IE, Firefox, and (Presto-based) Opera.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>863884</commentid>
    <comment_count>1</comment_count>
    <who name="Alexey Proskuryakov">ap</who>
    <bug_when>2013-03-26 11:58:08 -0700</bug_when>
    <thetext>&gt; this &quot;fixed&quot; order is invalid (or at least non-standard) 

In fact, &apos;\u05d1\u05bc\u05b5&apos; is not properly normalized - both NFC and NFD forms for this string are &apos;\u05d1\u05b5\u05bc&apos;. Please see &lt;http://unicode.org/reports/tr15/&gt; for discussion of Unicode normalization forms.
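
A minimal sketch of the reordering, using Python&apos;s standard unicodedata module rather than WebKit&apos;s own code: both normalization forms sort adjacent combining marks by canonical combining class, so the mark with the lower class comes first.

```python
import unicodedata

# The string from the report, in the order it was entered.
typed = "\u05d1\u05bc\u05b5"

# Canonical combining class of each code point (0 means a base character).
classes = [unicodedata.combining(ch) for ch in typed]

# NFC and NFD both apply canonical reordering to the combining marks.
nfc = unicodedata.normalize("NFC", typed)
nfd = unicodedata.normalize("NFD", typed)

print(classes)
print([f"U+{ord(ch):04X}" for ch in nfc])
print(nfc == nfd)
```

Here the classes come out as 0, 21 and 15, so U+05B5 is moved ahead of U+05BC in both forms.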

Overall, this is expected behavior.

The reason we normalize to NFC when sending form text is compatibility: since Windows uses NFC everywhere, there can be subtle errors when text sent from WebKit gets processed by systems that don&apos;t handle decomposed text well.

I can see how in this specific case WebKit becomes an outlier, but this is the cost of being like other browsers in more common cases.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>913387</commentid>
    <comment_count>2</comment_count>
    <who name="Alexey Proskuryakov">ap</who>
    <bug_when>2013-07-31 09:46:15 -0700</bug_when>
    <thetext>*** Bug 119320 has been marked as a duplicate of this bug. ***</thetext>
  </long_desc>
      
          <attachment
              isobsolete="0"
              ispatch="0"
              isprivate="0"
          >
            <attachid>194439</attachid>
            <date>2013-03-21 21:40:48 -0700</date>
            <delta_ts>2013-03-21 21:40:48 -0700</delta_ts>
            <desc>Source of PHP script to reproduce the problem</desc>
            <filename>w4.php</filename>
            <type>text/html</type>
            <size>1683</size>
            <attacher name="Konstantin">ikn</attacher>
            
              <data encoding="base64">PCFET0NUWVBFIGh0bWw+CjxodG1sPgogIDxoZWFkPgogICAgPG1ldGEgY29udGVudD0idGV4dC9o
dG1sOyBDSEFSU0VUPVVURi04IiBodHRwLWVxdWl2PSJDb250ZW50LVR5cGUiIC8+CiAgICA8dGl0
bGU+Rm9ybSBzdWJtaXNzaW9uIHRlc3Q8L3RpdGxlPiAgCiAgPC9oZWFkPgogIDxib2R5IHN0eWxl
PSJkaXJlY3Rpb246IGx0cjsiPgogICAgPGZvcm0gbmFtZT0ibWFpbmZvcm0iIGFjdGlvbj0iPD89
QCRfU0VSVkVSWydTQ1JJUFRfTkFNRSddOz8+IiAKCSAgbWV0aG9kPSJnZXQiIG9uc3VibWl0PSJy
ZXR1cm4gb25zdWIoKTsiPgogICAgICA8ZGl2PklucHV0IGZpZWxkOgoJPGlucHV0IHR5cGU9InRl
eHQiIG5hbWU9InYiIG1heGxlbmd0aD0iNTAiIHZhbHVlPSI8Pz1AJF9HRVRbJ3YnXTs/PiI+Cgk8
aW5wdXQgdHlwZT0iYnV0dG9uIiB2YWx1ZT0iJmx0Oz09IGluaXRpYWxpemUiIG9uY2xpY2s9ImNv
cHkoKSI+CiAgICAgIDwvZGl2PgogICAgICA8ZGl2PgoJPGlucHV0IHR5cGU9InN1Ym1pdCIgbmFt
ZT0iZm9ybVN1Ym1pdCIgdmFsdWU9IlRlc3QgY3VycmVudCB3b3JkIj4KCVJlc3VsdDogPHNwYW4g
c3R5bGU9ImZvbnQtd2VpZ2h0OiBib2xkOyBmb250LXNpemU6IGxhcmdlOyIgaWQ9ImNvbXByZXMi
Pj8/Pzwvc3Bhbj4KICAgICAgPC9kaXY+CiAgICAgIDxpbnB1dCBuYW1lPSJoYXNoIiB0eXBlPSJo
aWRkZW4iIHZhbHVlPSI8Pz1AJF9HRVRbJ2hhc2gnXTs/PiI+CiAgICA8L2Zvcm0+CiAgICA8c2Ny
aXB0IHR5cGU9InRleHQvamF2YXNjcmlwdCI+ClN0cmluZy5wcm90b3R5cGUuaGFzaENvZGUgPSBm
dW5jdGlvbigpewoJdmFyIGhhc2ggPSAwOwoJaWYgKHRoaXMubGVuZ3RoID09IDApIHJldHVybiBo
YXNoOwoJZm9yIChpID0gMDsgaSA8IHRoaXMubGVuZ3RoOyBpKyspIHsKCQljaGFyID0gdGhpcy5j
aGFyQ29kZUF0KGkpOwoJCWhhc2ggPSAoKGhhc2g8PDUpLWhhc2gpK2NoYXI7CgkJaGFzaCA9IGhh
c2ggJiBoYXNoOyAvLyBDb252ZXJ0IHRvIDMyYml0IGludGVnZXIKCX0KCXJldHVybiBoYXNoOwp9
CnZhciBlID0gZG9jdW1lbnQuZm9ybXMubWFpbmZvcm0uZWxlbWVudHMudjsKdmFyIGggPSBkb2N1
bWVudC5mb3Jtcy5tYWluZm9ybS5lbGVtZW50cy5oYXNoOwp2YXIgcyA9ICdcdTA1ZDFcdTA1YmNc
dTA1YjUnOwpmdW5jdGlvbiBjb3B5ICgpIHsKICBlLnZhbHVlID0gczsKfQpmdW5jdGlvbiBjb21w
YXJlICgpIHsKICB2YXIgYyA9IGRvY3VtZW50LmdldEVsZW1lbnRCeUlkKCdjb21wcmVzJyk7CiAg
aWYgKGUudmFsdWUuaGFzaENvZGUgKCkudG9TdHJpbmcgKCkgPT0gaC52YWx1ZSkgewogICAgYy5p
bm5lckhUTUwgPSAiUGFzcyI7CiAgICBjLnN0eWxlLmNvbG9yID0gImdyZWVuIjsKICB9CiAgZWxz
ZSB7CiAgICBjLmlubmVySFRNTCA9ICJGQUlMIjsKICAgIGMuc3R5bGUuY29sb3IgPSAicmVkIjsK
ICB9Cn0KZnVuY3Rpb24gb25zdWIgKCkgewogIGgudmFsdWUgPSBlLnZhbHVlLmhhc2hDb2RlICgp
LnRvU3RyaW5nICgpOwogIHJldHVybiB0cnVlOwp9CmlmIChoLnZhbHVlKQogIGNvbXBhcmUgKCk7
CiAgICA8L3NjcmlwdD4KICAgIDxkaXYgc3R5bGU9J2ZvbnQtc2l6ZTogc21hbGw7IGNvbG9yOiBn
cmF5Jz4KICAgICAgPGhyPgogICAgICA8Pz1AJF9TRVJWRVJbJ0hUVFBfVVNFUl9BR0VOVCddOz8+
CiAgICA8L2Rpdj4KICA8L2JvZHk+CjwvaHRtbD4K
</data>

          </attachment>
      

    </bug>

</bugzilla>