RESOLVED FIXED 43949
Add support for MathML entities
https://bugs.webkit.org/show_bug.cgi?id=43949
Summary Add support for MathML entities
Eric Seidel (no email)
Reported 2010-08-12 22:26:31 PDT
+++ This bug was initially created as a clone of Bug #43595 +++ Add support for MathML entities The original bug got spammed, so making a new one.
Attachments
Now with all build systems including HTMLEntitySearch.* (48.08 KB, patch)
2010-08-12 22:33 PDT, Eric Seidel (no email)
no flags
Now should compile for everyone except chromium and possibly tiger (48.83 KB, patch)
2010-08-13 12:26 PDT, Eric Seidel (no email)
no flags
Now should compile for everyone (50.77 KB, patch)
2010-08-13 14:57 PDT, Eric Seidel (no email)
no flags
Fixed mac build (50.71 KB, patch)
2010-08-13 16:01 PDT, Eric Seidel (no email)
no flags
Patch to fix cr-win builder (1.41 KB, patch)
2010-08-13 22:38 PDT, Eric Seidel (no email)
no flags
Eric Seidel (no email)
Comment 1 2010-08-12 22:26:52 PDT
*** Bug 43595 has been marked as a duplicate of this bug. ***
Eric Seidel (no email)
Comment 2 2010-08-12 22:30:46 PDT
@mmentovai: My gyp changes in the previous patch were partial and wrong. I just copied the gperf rule and modified it, so any typos you saw were also in the gperf rule. :)
Eric Seidel (no email)
Comment 3 2010-08-12 22:33:15 PDT
Created attachment 64303 [details] Now with all build systems including HTMLEntitySearch.*
Eric Seidel (no email)
Comment 4 2010-08-12 23:00:37 PDT
Adam Barth
Comment 5 2010-08-13 09:24:46 PDT
Comment on attachment 64303 [details] Now with all build systems including HTMLEntitySearch.* WebKitTools/ChangeLog:12 + patch will not build! OOPS! Maybe remove this note? WebCore/html/HTMLEntityParser.cpp:35 + #include "HTMLEntityNames.cpp" How does this still compile?
Eric Seidel (no email)
Comment 6 2010-08-13 09:35:08 PDT
Bah. You mean we need the old entity code for the xml code?!?
Eric Seidel (no email)
Comment 7 2010-08-13 12:26:52 PDT
Created attachment 64363 [details] Now should compile for everyone except chromium and possibly tiger
Eric Seidel (no email)
Comment 8 2010-08-13 12:30:54 PDT
Hmm... my patches are having the same trouble on win-ews. I think this is a svn-apply regression. :(
Eric Seidel (no email)
Comment 9 2010-08-13 12:33:57 PDT
Eric Seidel (no email)
Comment 10 2010-08-13 14:24:00 PDT
Odd. It seems to build both debug and release on my machine. Maybe a SL vs. Leopard difference. Maybe I forgot to do a clean build locally?
Adam Barth
Comment 11 2010-08-13 14:38:12 PDT
I bet the generated file was left over.
Eric Seidel (no email)
Comment 12 2010-08-13 14:57:01 PDT
Created attachment 64377 [details] Now should compile for everyone
Eric Seidel (no email)
Comment 13 2010-08-13 15:06:21 PDT
Eric Seidel (no email)
Comment 14 2010-08-13 15:12:44 PDT
Doing a clean build to see if I can figure out what's wrong with Mac. If that doesn't fix it, I'll poke the ews bot.
Eric Seidel (no email)
Comment 15 2010-08-13 15:46:40 PDT
Clean build worked locally. I'll poke the EWS bot...
Eric Seidel (no email)
Comment 16 2010-08-13 16:01:19 PDT
Created attachment 64384 [details] Fixed mac build
Eric Seidel (no email)
Comment 17 2010-08-13 16:15:13 PDT
This looks ready to land. We'll fix any Tiger/Chromium breakage after landing if necessary.
Eric Seidel (no email)
Comment 18 2010-08-13 19:01:13 PDT
Comment on attachment 64384 [details] Fixed mac build Since Adam wrote the patch, I guess I'm supposed to r+ it. The code looks great. Lets hope I did the build system integration right. :)
Eric Seidel (no email)
Comment 19 2010-08-13 20:18:39 PDT
WebKit Review Bot
Comment 20 2010-08-13 20:26:33 PDT
http://trac.webkit.org/changeset/65351 might have broken Chromium Linux Release
Eric Seidel (no email)
Comment 21 2010-08-13 20:29:13 PDT
Eric Seidel (no email)
Comment 22 2010-08-13 20:32:32 PDT
WebKit Review Bot
Comment 23 2010-08-13 21:11:13 PDT
http://trac.webkit.org/changeset/65353 might have broken SnowLeopard Intel Release (Tests)
Eric Seidel (no email)
Comment 24 2010-08-13 22:38:22 PDT
Created attachment 64407 [details] Patch to fix cr-win builder
Eric Seidel (no email)
Comment 25 2010-08-13 22:39:01 PDT
I would have landed this manually, except I have other stuff in my tree which would make me need a full rebuild. :p
WebKit Commit Bot
Comment 26 2010-08-13 23:45:05 PDT
Comment on attachment 64407 [details] Patch to fix cr-win builder Clearing flags on attachment: 64407 Committed r65359: <http://trac.webkit.org/changeset/65359>
WebKit Commit Bot
Comment 27 2010-08-13 23:45:12 PDT
All reviewed patches have been landed. Closing bug.
Eric Seidel (no email)
Comment 28 2010-08-14 00:18:34 PDT
WebKit Review Bot
Comment 29 2010-08-14 00:55:57 PDT
http://trac.webkit.org/changeset/65360 might have broken SnowLeopard Intel Release (Tests)
Mihai Parparita
Comment 30 2010-08-15 10:55:24 PDT
I'm still having trouble building Chromium/TestShell after http://trac.webkit.org/changeset/65351 (which removed HTMLEntityNames.gperf, even with http://trac.webkit.org/changeset/65353. WebKit/chromium/src/WebEntities.cpp still includes HTMLEntityNames.cpp, and uses the Entity struct from it: http://trac.webkit.org/browser/trunk/WebKit/chromium/src/WebEntities.cpp#L50
Eric Seidel (no email)
Comment 31 2010-08-15 11:02:40 PDT
Yup. I could see how that would break. We'll need to re-write WebEntities.cpp. That code looks wrong. Why would chromium even need this code?
Eric Seidel (no email)
Comment 32 2010-08-15 11:05:56 PDT
The easy fix is to restore the HTMLEntityNames.cpp generation code for chromium only. The code as-written is wrong for HTML5. But someone from chromium who understands this code will have to fix that.
Eric Seidel (no email)
Comment 33 2010-08-15 11:11:09 PDT
I suspect the WebEntities stuff is part of Chromiums broken hacks around markup.cpp. iirc, Chromium does its own dom serialization outside of WebCore. Which is super-wrong. The correct fix is to fix markup.cpp to work as Chromium needs it to. But someone who knows what this code is for should comment. I'll see if I can sketch up a patch to fix the build break.
Mihai Parparita
Comment 34 2010-08-15 11:12:21 PDT
(In reply to comment #31) > Why would chromium even need this code? Looks like it was added for http://trac.webkit.org/changeset/52268
Adam Barth
Comment 35 2010-08-15 11:15:44 PDT
> The easy fix is to restore the HTMLEntityNames.cpp generation code for chromium only. I don't think that's the right fix. The minimal thing we should do is move this code over to using HTMLEntityTable. A better fix would be to move this logic into WebCore, where it belongs, and make it correct.
Eric Seidel (no email)
Comment 36 2010-08-15 11:18:37 PDT
The logic appears to be trying to build a reverse lookup map for the entity table. markup.cpp only escapes a few entities, not all of the HTML ones like Chromium seems to. I'm not sure why they would want to escape all possible entities, that would not end up looking like the original source. :) Honestly this code is just horribly wrong, and I think we should try not to touch it. We should retore their build and let someone else fix the wrongness.
Adam Barth
Comment 37 2010-08-15 11:22:45 PDT
> The logic appears to be trying to build a reverse lookup map for the entity table. Right, which is easy to do from HTMLEntityTable. > Honestly this code is just horribly wrong, and I think we should try not to touch it. We should retore their build and let someone else fix the wrongness. It seems very bad to have two out-of-sync lists of all the entities. Surely that's not a good idea.
Eric Seidel (no email)
Comment 38 2010-08-15 12:40:31 PDT
Yes, but this is clearly already an out-of-sync list. Between markup.cpp and this code. Why Chromium has chosen this arbitrary subset of HTML entities to replace when creating markup from a DOM is unclear. Why they have an explicit blacklist instead of a whitelist is also unclear. Moving them to the new list of entities will have a behavior change they may or may not want. Again, completely unclear from this code what they want. Either path forward is fine. My gut would say just restore the old broken behavior to fix their compile. We could also write new broken code for them. But that sounds like more work which seems silly to do since this whole approach/architecture/code-location seems wrong. :)
Frédéric Wang (:fredw)
Comment 39 2011-06-06 07:45:03 PDT
I don't know if it is the right place to report the problem, but a unit test for MathJax detected the following problems with these Webkit-based browsers (on Windows 7): Safari 5.0.5: &lang; and &rang; are not converted to the new values. Chrome 11.0.696.71: &notinE; and &notindot; are converted to ¬inE; and ¬indot; Maybe it's fixed in development versions, but I just wanted to inform Webkit's developers.
Alexey Proskuryakov
Comment 40 2011-06-06 14:12:45 PDT
Please report new issues in new bugs. I have a few quick comments however. > Safari 5.0.5: &lang; and &rang; are not converted to the new values. This was a reckless incompatible spec change. We can update to the new mapping once it's implemented in other browsers, but not just because of a spec change. > Chrome 11.0.696.71: &notinE; and &notindot; are converted to ¬inE; and ¬indot; This is something we need to add support for (please file a new bug). Note this is correct parsing for an engine that doesn't support these character references.
Frédéric Wang (:fredw)
Comment 41 2011-06-06 14:34:41 PDT
(In reply to comment #40) > Please report new issues in new bugs. I have a few quick comments however. I didn't know if this should be dealt in Webkit or if it specific to each browser. So I preferred to leave a comment here rather than opening duplicated or misclassified bugs. > > > Safari 5.0.5: &lang; and &rang; are not converted to the new values. > > This was a reckless incompatible spec change. We can update to the new mapping once it's implemented in other browsers, but not just because of a spec change. It's already in stable versions of Chrome and Firefox, at least (I don't know for the other browsers). > > > Chrome 11.0.696.71: &notinE; and &notindot; are converted to ¬inE; and ¬indot; > > This is something we need to add support for (please file a new bug). Note this is correct parsing for an engine that doesn't support these character references. OK. Again, should I report this in Webkit bugzilla or is it best to do that in a dedicated place for Chrome?
Adam Barth
Comment 42 2011-06-06 15:21:54 PDT
> OK. Again, should I report this in Webkit bugzilla or is it best to do that in a dedicated place for Chrome? WebKit please. Thanks!
Alexey Proskuryakov
Comment 43 2011-06-07 09:57:16 PDT
> It's already in stable versions of Chrome and Firefox, at least (I don't know for the other browsers). If Chrome has updated, then the change is probably in mainline WebKit - I don't think that this is something they customize. Please test with a nightly build from http://nightly.webkit.org or with Safari 5.1 developer preview.
Frédéric Wang (:fredw)
Comment 44 2011-06-09 03:11:42 PDT
(In reply to comment #43) > > It's already in stable versions of Chrome and Firefox, at least (I don't know for the other browsers). > > If Chrome has updated, then the change is probably in mainline WebKit - I don't think that this is something they customize. Please test with a nightly build from http://nightly.webkit.org or with Safari 5.1 developer preview. I just tested with a nightly build. The entities lang and rang are still mapped to the former values.
Adam Barth
Comment 45 2011-06-09 10:36:13 PDT
> I just tested with a nightly build. The entities lang and rang are still mapped to the former values. The entities are stored here: http://trac.webkit.org/browser/trunk/Source/WebCore/html/parser/HTMLEntityNames.in They should be the same for all ports.
Alexey Proskuryakov
Comment 46 2011-06-09 10:43:32 PDT
In any case, please file a new bug (in this Bugzilla).
Frédéric Wang (:fredw)
Comment 47 2012-04-16 12:51:37 PDT
(In reply to comment #45) > > I just tested with a nightly build. The entities lang and rang are still mapped to the former values. > > The entities are stored here: > > http://trac.webkit.org/browser/trunk/Source/WebCore/html/parser/HTMLEntityNames.in > > They should be the same for all ports. I just tested again with the latest release of Safari and lang and rang are now mapped to the new entities. Thanks.
Note You need to log in before you can comment on or make changes to this bug.