NEW 20027
Keyup/keydown keyIdentifier assignments are not W3C compliant
https://bugs.webkit.org/show_bug.cgi?id=20027
Allan Jacobs
Reported 2008-07-13 07:12:53 PDT
Safari, on Windows XP, attached an attribute named keyIdentifier to Event objects. The intent was clearly to implement part of the behavior of the textInput event type as described in the W3C DOM Level 3 Events specification. The intent was good, but the execution does not comply with the specification. On all layouts, Safari simply prepends "U+" to the value of the keyCode in hex. This works for simple ASCII characters but is not correct for thousands of other keys.

The consequences for web development are severe for keys on the extreme right of the keyboard (so-called OEM keys) that are often associated with punctuation characters. It also makes coding much less natural for languages other than English.

This bug report addresses the hard part of the problem: what to do with keys that are used for letters and digits. Control characters (like arrow keys or the Ctrl key) are not addressed.

The keyIdentifier should be a function of both layout and keycode. The keyIdentifier should not simply be the keycode converted into hex. The W3C specification says that the assignment should reflect the Unicode values of the characters produced by the key.

The algorithm used here:
1. Use a numeric value if it is available in normal, shift, or AltGr state.
2. If 1 does not apply and if the key has an upper/lower case assignment that is state dependent, use the Unicode for an upper case character.
3. If neither 1 nor 2 applies, use the Unicode for the normal state key assignment.
4. If 1, 2, and 3 do not apply, use the Unicode for the shift state key assignment.
5. If 1, 2, 3, and 4 do not apply, use the Unicode for the AltGr state key assignment.

Shift state is obtained by depressing the Shift key and then the key in question. The less familiar AltGr state obtains when depressing the right Alt key and then the key in question. The Shift+AltGr state obtains when the Shift and right Alt keys are depressed when the key in question is keyed down.
Normal state obtains when a key is depressed and neither the Shift nor the Alt keys are depressed. There is one more state that obtains for the Hebrew layout that is not relevant to this bug.

The algorithm sketched above can be used to construct a new mapping for each of the Windows keyboard layouts. The table does not fit in a Bugzilla description field. The mapping starts out as follows:

mysql> select distinct layout,keycode,upper(w3cIdentifier) from keymap where os='Win' and browser='IE' order by layout,keycode;
+---------------------------------------+---------+----------------------+
| layout                                | keycode | upper(w3cIdentifier) |
+---------------------------------------+---------+----------------------+
| Albanian                              |      48 | U+0030               |
| Albanian                              |      49 | U+0031               |
| Albanian                              |      50 | U+0032               |
| Albanian                              |      51 | U+0033               |
| Albanian                              |      52 | U+0034               |
| Albanian                              |      53 | U+0035               |
| Albanian                              |      54 | U+0036               |
| Albanian                              |      55 | U+0037               |
| Albanian                              |      56 | U+0038               |
| Albanian                              |      57 | U+0039               |
| Albanian                              |      65 | U+0041               |
| Albanian                              |      66 | U+0042               |
| Albanian                              |      67 | U+0043               |
| Albanian                              |      68 | U+0044               |
| Albanian                              |      69 | U+0045               |
| Albanian                              |      70 | U+0046               |
| Albanian                              |      71 | U+0047               |
| Albanian                              |      72 | U+0048               |
| Albanian                              |      73 | U+0049               |
| Albanian                              |      74 | U+004A               |
| Albanian                              |      75 | U+004B               |
| Albanian                              |      76 | U+004C               |
| Albanian                              |      77 | U+004D               |
| Albanian                              |      78 | U+004E               |
| Albanian                              |      79 | U+004F               |
| Albanian                              |      80 | U+0050               |
| Albanian                              |      81 | U+0051               |
| Albanian                              |      82 | U+0052               |
| Albanian                              |      83 | U+0053               |
| Albanian                              |      84 | U+0054               |
| Albanian                              |      85 | U+0055               |
| Albanian                              |      86 | U+0056               |
| Albanian                              |      87 | U+0057               |
| Albanian                              |      88 | U+0058               |
| Albanian                              |      89 | U+0059               |
| Albanian                              |      90 | U+005A               |
| Albanian                              |     186 | U+00CB               |
| Albanian                              |     187 | U+003D               |
| Albanian                              |     188 | U+002C               |
| Albanian                              |     189 | U+002D               |
| Albanian                              |     190 | U+002E               |
| Albanian                              |     191 | U+002F               |
| Albanian                              |     192 | U+005C               |
| Albanian                              |     219 | U+00C7               |
| Albanian                              |     220 | U+005D               |
| Albanian                              |     221 | U+0040               |
| Albanian                              |     222 | U+005B               |
| Albanian                              |     226 | U+003C               |
| Arabic (101)                          |      48 | U+0030               |
| Arabic (101)                          |      49 | U+0031               |
| Arabic (101)                          |      50 | U+0032               |
| Arabic (101)                          |      51 | U+0033               |
| Arabic (101)                          |      52 | U+0034               |
| Arabic (101)                          |      53 | U+0035               |
| Arabic (101)                          |      54 | U+0036               |
| Arabic (101)                          |      55 | U+0037               |
| Arabic (101)                          |      56 | U+0038               |
| Arabic (101)                          |      57 | U+0039               |
| Arabic (101)                          |      65 | U+0634               |
| Arabic (101)                          |      66 | U+0644               |
| Arabic (101)                          |      67 | U+0624               |
| Arabic (101)                          |      68 | U+064A               |
| Arabic (101)                          |      69 | U+062B               |
| Arabic (101)                          |      70 | U+0628               |
| Arabic (101)                          |      71 | U+0644               |
| Arabic (101)                          |      72 | U+0627               |
| Arabic (101)                          |      73 | U+0647               |
| Arabic (101)                          |      74 | U+062A               |
| Arabic (101)                          |      75 | U+0646               |
| Arabic (101)                          |      76 | U+0645               |
| Arabic (101)                          |      77 | U+0649               |
| Arabic (101)                          |      78 | U+0627               |
| Arabic (101)                          |      79 | U+062E               |
| Arabic (101)                          |      80 | U+062D               |
| Arabic (101)                          |      81 | U+0636               |
| Arabic (101)                          |      82 | U+0642               |
| Arabic (101)                          |      83 | U+0633               |
| Arabic (101)                          |      84 | U+0641               |
| Arabic (101)                          |      85 | U+0639               |
| Arabic (101)                          |      86 | U+0631               |
| Arabic (101)                          |      87 | U+0635               |
| Arabic (101)                          |      88 | U+0621               |
| Arabic (101)                          |      89 | U+063A               |
| Arabic (101)                          |      90 | U+0626               |
| Arabic (101)                          |     186 | U+0643               |
| Arabic (101)                          |     187 | U+003D               |
| Arabic (101)                          |     188 | U+0629               |
| Arabic (101)                          |     189 | U+002D               |
| Arabic (101)                          |     190 | U+0648               |
| Arabic (101)                          |     191 | U+0632               |
| Arabic (101)                          |     192 | U+0630               |
| Arabic (101)                          |     219 | U+062C               |
| Arabic (101)                          |     220 | U+005C               |
| Arabic (101)                          |     221 | U+062F               |
| Arabic (101)                          |     222 | U+0637               |
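
The two rules discussed in the description can be sketched as follows, in TypeScript for illustration only. safariKeyIdentifier mirrors the reported behavior (prefix "U+" to the keyCode in hex), and proposedKeyIdentifier follows the five-step selection order. The KeyAssignments shape, the function names, and the reading of "numeric value" as an ASCII digit are assumptions and are not taken from the attachments.

```typescript
// Hypothetical shape: the character a key produces in each modifier state.
// A state with no assignment is left undefined.
interface KeyAssignments {
  normal?: string;
  shift?: string;
  altGr?: string;
}

// Format a code unit as "U+XXXX" (upper-case hex, at least four digits).
// Assumption: characters lie in the Basic Multilingual Plane.
function toIdentifier(code: number): string {
  let hex = code.toString(16).toUpperCase();
  while (hex.length < 4) {
    hex = "0" + hex;
  }
  return "U+" + hex;
}

// Behavior reported in this bug: the identifier ignores the layout.
function safariKeyIdentifier(keyCode: number): string {
  return toIdentifier(keyCode);
}

// Sketch of the five-step rule from the description.
function proposedKeyIdentifier(key: KeyAssignments): string | undefined {
  const states = [key.normal, key.shift, key.altGr];

  // Step 1: a digit in the normal, shift, or AltGr state wins.
  // Assumption: "numeric value" means an ASCII digit.
  for (const ch of states) {
    if (ch !== undefined && /^[0-9]$/.test(ch)) {
      return toIdentifier(ch.charCodeAt(0));
    }
  }

  // Step 2: normal and shift differ only by case, so use the upper-case form.
  const normal = key.normal;
  const shift = key.shift;
  if (
    normal !== undefined &&
    shift !== undefined &&
    normal !== shift &&
    normal.toLowerCase() === shift.toLowerCase()
  ) {
    return toIdentifier(normal.toUpperCase().charCodeAt(0));
  }

  // Steps 3 to 5: first available assignment in normal, shift, AltGr order.
  for (const ch of states) {
    if (ch !== undefined && ch.length > 0) {
      return toIdentifier(ch.charCodeAt(0));
    }
  }
  return undefined;
}
```

With only the normal-state character supplied, proposedKeyIdentifier({ normal: "\u0634" }) yields "U+0634", the value listed for keycode 65 on Arabic (101), whereas safariKeyIdentifier(65) yields "U+0041" on every layout.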
Attachments
Keycode to keyIdentifier mapping for Windows keyboard layouts (419.86 KB, text/plain)
2008-07-13 07:15 PDT, Allan Jacobs
no flags
Testcase demonstrating keyIdentifier assignments (5.13 KB, text/html)
2008-07-13 09:28 PDT, Allan Jacobs
no flags
Keycode/charIdentifier to w3cIdentifier assignment on Ubuntu Linux 9.10 / Chrome 4.0.249.4. (7.76 MB, text/plain)
2009-11-22 12:11 PST, Allan Jacobs
no flags
SQL to assign w3c identifiers to keyboard events. (235.77 KB, text/x-sql)
2009-11-22 12:19 PST, Allan Jacobs
no flags
Allan Jacobs
Comment 1 2008-07-13 07:15:10 PDT
Created attachment 22260 [details] Keycode to keyIdentifier mapping for Windows keyboard layouts
Allan Jacobs
Comment 2 2008-07-13 09:28:02 PDT
Created attachment 22261 [details] Testcase demonstrating keyIdentifier assignments

Testcase.

Add the Arabic (101) layout. Use the Windows Control Panel. Choose Regional and Language Options. In the Regional and Language Options dialog, choose the Languages pane and click on Details. Try adding Arabic/Arabic (101).

Once added, use the operating system to change the layout for the Safari browser window to Arabic (101). Type some characters into the text field. The application will tabulate some of the keyCode (first column) and keyIdentifier (fourth column) assignments. The second column is the Unicode of the character that the key in this layout produces (retrieved by combining keydown with keypress information).

For instance, typing the 'asdf' keys in sequence:

65 U+0634 ش U+0041
83 U+0633 س U+0053
68 U+064A ي U+0044
70 U+0628 ب U+0046
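
The tabulation described above can be approximated as follows. This is a sketch in TypeScript, not the code from the attachment: it assumes each keydown is followed by at most one keypress and pairs the two to build a row. KeyEventLike, hexIdentifier, and tabulate are hypothetical names, and plain objects stand in for DOM events so the logic can run outside a browser.

```typescript
// Hypothetical stand-in for the fields of a DOM keyboard event used here.
interface KeyEventLike {
  type: "keydown" | "keypress";
  keyCode: number; // read on keydown
  charCode: number; // read on keypress
  keyIdentifier: string; // read on keydown
}

// Format a code unit as "U+XXXX" (upper-case hex, at least four digits).
function hexIdentifier(code: number): string {
  let hex = code.toString(16).toUpperCase();
  while (hex.length < 4) {
    hex = "0" + hex;
  }
  return "U+" + hex;
}

// Pair each keydown with the keypress that follows it and emit one row:
// keyCode, Unicode of the produced character, the character, keyIdentifier.
function tabulate(events: KeyEventLike[]): string[] {
  const rows: string[] = [];
  let pending: KeyEventLike | undefined;
  for (const ev of events) {
    if (ev.type === "keydown") {
      pending = ev;
    } else if (pending !== undefined) {
      rows.push(
        [
          String(pending.keyCode),
          hexIdentifier(ev.charCode),
          String.fromCharCode(ev.charCode),
          pending.keyIdentifier,
        ].join(" ")
      );
      pending = undefined;
    }
  }
  return rows;
}

// The 'a' key on Arabic (101), using the values from the sample output above.
// Fields not read for a given event type are filled with placeholders.
const sampleRows = tabulate([
  { type: "keydown", keyCode: 65, charCode: 0, keyIdentifier: "U+0041" },
  { type: "keypress", keyCode: 0, charCode: 0x0634, keyIdentifier: "" },
]);
console.log(sampleRows.join("\n")); // prints: 65 U+0634 ش U+0041
```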
Mark Rowe (bdash)
Comment 3 2008-07-14 12:49:15 PDT
Alexey Proskuryakov
Comment 4 2008-07-15 14:08:35 PDT
Please note that the specification is still in draft, and thus is subject to change. Without a rationale available, it is hard to predict which direction the specification will take - e.g. some use cases call for a key identifier that is NOT dependent on the layout.
Allan Jacobs
Comment 5 2008-07-23 20:49:01 PDT
A few keycodes were misassigned, with consequences for the keyIdentifier assignments. These were discovered by comparing the keyIdentifier assignments independently made for Firefox and Opera. With these corrections, it is my belief that the probability of an error is now roughly 1 in 10000 for Firefox and Internet Explorer. The error rate for Opera is larger. The database detected conflicts in keyCode assignments for Opera that cannot be patched.

The assignment
Czech (QWERTY) 32 U+00b4
should not have been made at all.

The line reading
Czech (QWERTY) 187 U+02c7
should read
Czech (QWERTY) 187 U+00b4

The line reading
Czech (QWERTY) 220 U+0027
should read
Czech (QWERTY) 220 U+00a8

The line reading
Latin American 187 U+002a
should read
Latin American 187 U+002b

The line reading
Latin American 191 U+002b
should read
Latin American 191 U+007d
O. Andersen
Comment 6 2009-10-21 15:22:12 PDT
(In reply to comment #4)
> some use cases call for a key identifier
> that is NOT dependent on the layout.

The Dashboard Widget ‘Tastiera’ <http://coq.no/widget/tastiera/en> is an example of such a use case. If the standardisation effort ends up with only layout-dependent identifiers, adding a physical identifier (ADB codes or a more logical key numbering) as well might be a good idea.

An additional problem for non-US keyboard layouts is that dead keys cannot be detected: no onkeypress event is generated, and onkeyup/down both give keyIdentifier = Unidentified, which = 0, keyCode = 0. This was reported to Apple as bug No. 6600446.

Has there been any progress on this topic lately? Should I open a new bug for the problem with dead keys?
Alexey Proskuryakov
Comment 7 2009-10-21 16:13:45 PDT
> Should I open a new bug for the problem with dead keys?

Since the existing Radar bug is only visible to Apple employees, I'd say yes.
O. Andersen
Comment 8 2009-10-21 16:59:15 PDT
> > Should I open a new bug for the problem with dead keys?
> Since the existing Radar bug is only visible to Apple employees, I'd say yes.

Filed as bug No. 30652.
Allan Jacobs
Comment 9 2009-11-22 12:11:14 PST
Created attachment 43686 [details] Keycode/charIdentifier to w3cIdentifier assignment on Ubuntu Linux 9.10 / Chrome 4.0.249.4.

Keycode assignments on Linux are buggy, so keycode and charIdentifier are both included as columns in this file. The SQL used to make the assignments will be attached shortly. The AltGr and Shift+AltGr states for the 105th key on my 105-key keyboard (lower leftmost character key) were often not properly active -- these were culled out with the "w3cIdentifier not null" clause (top of the attachment).
Allan Jacobs
Comment 10 2009-11-22 12:19:18 PST
Created attachment 43687 [details] SQL to assign w3c identifiers to keyboard events.

mysql> describe layout;
+--------------+-------------+------+-----+---------+-------+
| Field        | Type        | Null | Key | Default | Extra |
+--------------+-------------+------+-----+---------+-------+
| id           | int(5)      | NO   | PRI | NULL    |       |
| name         | varchar(66) | YES  |     | NULL    |       |
| display_name | varchar(66) | YES  |     | NULL    |       |
+--------------+-------------+------+-----+---------+-------+
3 rows in set (0.00 sec)

mysql> describe keymap;
+----------------+-------------+------+-----+---------+-------+
| Field          | Type        | Null | Key | Default | Extra |
+----------------+-------------+------+-----+---------+-------+
| id             | int(5)      | NO   | PRI | NULL    |       |
| os             | varchar(10) | YES  |     | NULL    |       |
| browser        | varchar(10) | YES  |     | NULL    |       |
| layout_id      | int(5)      | YES  | MUL | NULL    |       |
| state          | varchar(20) | YES  |     | NULL    |       |
| keycode        | int(6)      | YES  |     | NULL    |       |
| charIdentifier | varchar(15) | YES  |     | NULL    |       |
| w3cIdentifier  | varchar(15) | YES  |     | NULL    |       |
| deadKey        | int(1)      | YES  |     | NULL    |       |
| layout         | varchar(66) | YES  |     | NULL    |       |
| comment        | varchar(66) | YES  |     | NULL    |       |
+----------------+-------------+------+-----+---------+-------+
11 rows in set (0.00 sec)

Is there any interest in getting a copy of the database?
Allan Jacobs
Comment 11 2010-09-09 16:45:10 PDT
The W3C specification has changed. Identifying the key independent of state (that is, of a modifier) is no longer one of its ambitions. This makes this bug irrelevant. Bug 20027 should be closed.

"keyidentifier" refers to content at http://www.w3.org/TR/2007/WD-DOM-Level-3-Events-20071221/events.html#Events-KeyboardEvent and at http://www.w3.org/TR/2007/WD-DOM-Level-3-Events-20071221/keyset.html .

Refer to http://www.w3.org/TR/DOM-Level-3-Events/#keys-Guide in the version of the W3C specification dated Sept 7, 2010 (two days ago). In the new draft, keyup, keydown, and keypress implement the KeyboardEvent interface, which mandates the presence of the attributes 'char', 'key', and 'keyCode'. keyCode is legacy. User code will see wild variation when changing browsers, operating systems, keyboard locales, and even shift states.

For most visible characters, 'char' and 'key' will be assigned the same Unicode character value. The value depends on modifier state. 'char' and 'key' differ for some punctuation characters. For instance, hitting the spacebar causes an event with 'char' set to a space (\u0020) and 'key' set to the string 'Spacebar'. Keys with no character representation have 'char' set to null and 'key' assigned a meaningful value. For instance, an Up Arrow key has 'char'=null and 'key'='Up'.
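
As an illustration of the 'char' and 'key' attributes described in this comment, the sketch below classifies an event by those two values. It follows the September 2010 draft as summarized above, so the strings 'Spacebar' and 'Up' come from this comment and may not match later revisions of the specification. The DraftKeyInfo type and describeKey function are hypothetical names.

```typescript
// Hypothetical record holding the two draft attributes discussed above.
interface DraftKeyInfo {
  char: string | null; // null when the key has no character representation
  key: string;
}

// Describe a key according to how 'char' and 'key' relate to each other.
function describeKey(info: DraftKeyInfo): string {
  if (info.char === null) {
    // Example from the comment: Up Arrow has char = null, key = 'Up'.
    return "named key " + info.key + " (no character)";
  }
  if (info.char === info.key) {
    // Most visible characters: both attributes carry the same value.
    return "character key " + JSON.stringify(info.char);
  }
  // Example from the comment: spacebar has char = ' ', key = 'Spacebar'.
  return "named key " + info.key + " producing " + JSON.stringify(info.char);
}

console.log(describeKey({ char: null, key: "Up" }));
console.log(describeKey({ char: " ", key: "Spacebar" }));
console.log(describeKey({ char: "a", key: "a" }));
```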
Eric Seidel (no email)
Comment 12 2012-10-27 01:30:07 PDT
Is this still an issue?
mikolaj.konarski
Comment 13 2015-10-03 09:52:34 PDT
> Is this still an issue?

Unfortunately, yes. Handling normal and control keys in the same code is a nightmare, even without taking browser quirks into account. This

http://webkitgtk.org/reference/webkitdomgtk/stable/WebKitDOMKeyboardEvent.html

is years behind that

https://developer.mozilla.org/en-US/docs/Web/API/KeyboardEvent

In particular, we are using the non-standard and deprecated

https://developer.mozilla.org/en-US/docs/Web/API/KeyboardEvent/keyIdentifier

which absolutely doesn't agree with

https://developer.mozilla.org/en-US/docs/Web/API/KeyboardEvent/key

and I can't see a way of getting the functionality of the latter (apart from coding it from scratch using the functions that we have and hacking around browser quirks, if that is even possible).
mikolaj.konarski
Comment 14 2016-09-30 16:13:42 PDT
Additionally, Chrome will soon drop keyIdentifier ( https://www.chromestatus.com/features/5316065118650368 ), so JS that uses keyIdentifier (because of WebKit) will no longer work on Chrome. It would be incredibly useful if WebKit implemented the current standard: https://w3c.github.io/uievents/#events-keyboardevents
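
For code that currently reads keyIdentifier, one possible stopgap is a small shim that prefers the standard key attribute when it is present and otherwise decodes the "U+XXXX" form. The sketch below is an assumption about how such a shim could look, not something WebKit or Chrome provides; KeyLike and keyName are hypothetical names. Since keyIdentifier as reported in this bug is derived from the keyCode alone, the fallback cannot reflect shift state and will not reproduce key exactly.

```typescript
// Hypothetical subset of a keyboard event: either attribute may be missing.
interface KeyLike {
  key?: string;
  keyIdentifier?: string;
}

// Return a key name, preferring the standard attribute when available.
function keyName(ev: KeyLike): string | undefined {
  if (ev.key !== undefined) {
    return ev.key;
  }
  const id = ev.keyIdentifier;
  if (id === undefined) {
    return undefined;
  }
  // Named identifiers such as "Enter" pass through unchanged.
  if (!/^U\+[0-9A-Fa-f]{4}$/.test(id)) {
    return id;
  }
  // Decode "U+XXXX" (four hex digits, so BMP only) into its character.
  return String.fromCharCode(parseInt(id.slice(2), 16));
}

console.log(keyName({ keyIdentifier: "U+0041" })); // prints: A
```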