Summary: | Add support for MathML entities | ||
---|---|---|---|
Product: | WebKit | Reporter: | Eric Seidel (no email) <eric> |
Component: | WebCore Misc. | Assignee: | Nobody <webkit-unassigned> |
Status: | RESOLVED FIXED | ||
Severity: | Normal | CC: | abarth, ap, aroben, commit-queue, darin, dbates, dglazkov, eric, fishd, fred.wang, jamesr, japhet, jchaffraix, mark, sausset, tony, webkit-ews, webkit.review.bot, yaar |
Priority: | P2 | ||
Version: | 528+ (Nightly build) | ||
Hardware: | Other | ||
OS: | OS X 10.5 | ||
Bug Depends on: | 43595, 44036 | ||
Bug Blocks: | 84012 | ||
Attachments: |
Description
Eric Seidel (no email)
2010-08-12 22:26:31 PDT
*** Bug 43595 has been marked as a duplicate of this bug. *** @mmentovai: My gyp changes in the previous patch were partial and wrong. I just copied the gperf rule and modified it, so any typos you saw were also in the gperf rule. :) Created attachment 64303 [details]
Now with all build systems including HTMLEntitySearch.*
Attachment 64303 [details] did not build on mac: Build output: http://queues.webkit.org/results/3754101 Comment on attachment 64303 [details]
Now with all build systems including HTMLEntitySearch.*
WebKitTools/ChangeLog:12
+ patch will not build! OOPS!
Maybe remove this note?
WebCore/html/HTMLEntityParser.cpp:35
+ #include "HTMLEntityNames.cpp"
How does this still compile?
Bah. You mean we need the old entity code for the xml code?!? Created attachment 64363 [details]
Now should compile for everyone except chromium and possibly tiger
Hmm... my patches are having the same trouble on win-ews. I think this is a svn-apply regression. :( Attachment 64363 [details] did not build on mac: Build output: http://queues.webkit.org/results/3757134 Odd. It seems to build both debug and release on my machine. Maybe a SL vs. Leopard difference. Maybe I forgot to do a clean build locally? I bet the generated file was left over. Created attachment 64377 [details]
Now should compile for everyone
Attachment 64377 [details] did not build on mac: Build output: http://queues.webkit.org/results/3712146 Doing a clean build to see if I can figure out what's wrong with Mac. If that doesn't fix it, I'll poke the ews bot. Clean build worked locally. I'll poke the EWS bot... Created attachment 64384 [details]
Fixed mac build
This looks ready to land. We'll fix any Tiger/Chromium breakage after landing if necessary. Comment on attachment 64384 [details]
Fixed mac build
Since Adam wrote the patch, I guess I'm supposed to r+ it.
The code looks great. Lets hope I did the build system integration right. :)
Committed r65351: <http://trac.webkit.org/changeset/65351> http://trac.webkit.org/changeset/65351 might have broken Chromium Linux Release Committed r65352: <http://trac.webkit.org/changeset/65352> Committed r65353: <http://trac.webkit.org/changeset/65353> http://trac.webkit.org/changeset/65353 might have broken SnowLeopard Intel Release (Tests) Created attachment 64407 [details]
Patch to fix cr-win builder
I would have landed this manually, except I have other stuff in my tree which would make me need a full rebuild. :p Comment on attachment 64407 [details] Patch to fix cr-win builder Clearing flags on attachment: 64407 Committed r65359: <http://trac.webkit.org/changeset/65359> All reviewed patches have been landed. Closing bug. Committed r65360: <http://trac.webkit.org/changeset/65360> http://trac.webkit.org/changeset/65360 might have broken SnowLeopard Intel Release (Tests) I'm still having trouble building Chromium/TestShell after http://trac.webkit.org/changeset/65351 (which removed HTMLEntityNames.gperf, even with http://trac.webkit.org/changeset/65353. WebKit/chromium/src/WebEntities.cpp still includes HTMLEntityNames.cpp, and uses the Entity struct from it: http://trac.webkit.org/browser/trunk/WebKit/chromium/src/WebEntities.cpp#L50 Yup. I could see how that would break. We'll need to re-write WebEntities.cpp. That code looks wrong. Why would chromium even need this code? The easy fix is to restore the HTMLEntityNames.cpp generation code for chromium only. The code as-written is wrong for HTML5. But someone from chromium who understands this code will have to fix that. I suspect the WebEntities stuff is part of Chromiums broken hacks around markup.cpp. iirc, Chromium does its own dom serialization outside of WebCore. Which is super-wrong. The correct fix is to fix markup.cpp to work as Chromium needs it to. But someone who knows what this code is for should comment. I'll see if I can sketch up a patch to fix the build break. (In reply to comment #31) > Why would chromium even need this code? Looks like it was added for http://trac.webkit.org/changeset/52268 > The easy fix is to restore the HTMLEntityNames.cpp generation code for chromium only.
I don't think that's the right fix. The minimal thing we should do is move this code over to using HTMLEntityTable. A better fix would be to move this logic into WebCore, where it belongs, and make it correct.
The logic appears to be trying to build a reverse lookup map for the entity table. markup.cpp only escapes a few entities, not all of the HTML ones like Chromium seems to. I'm not sure why they would want to escape all possible entities, that would not end up looking like the original source. :) Honestly this code is just horribly wrong, and I think we should try not to touch it. We should retore their build and let someone else fix the wrongness. > The logic appears to be trying to build a reverse lookup map for the entity table. Right, which is easy to do from HTMLEntityTable. > Honestly this code is just horribly wrong, and I think we should try not to touch it. We should retore their build and let someone else fix the wrongness. It seems very bad to have two out-of-sync lists of all the entities. Surely that's not a good idea. Yes, but this is clearly already an out-of-sync list. Between markup.cpp and this code. Why Chromium has chosen this arbitrary subset of HTML entities to replace when creating markup from a DOM is unclear. Why they have an explicit blacklist instead of a whitelist is also unclear. Moving them to the new list of entities will have a behavior change they may or may not want. Again, completely unclear from this code what they want. Either path forward is fine. My gut would say just restore the old broken behavior to fix their compile. We could also write new broken code for them. But that sounds like more work which seems silly to do since this whole approach/architecture/code-location seems wrong. :) I don't know if it is the right place to report the problem, but a unit test for MathJax detected the following problems with these Webkit-based browsers (on Windows 7): Safari 5.0.5: ⟨ and ⟩ are not converted to the new values. Chrome 11.0.696.71: ⋹̸ and ⋵̸ are converted to ¬inE; and ¬indot; Maybe it's fixed in development versions, but I just wanted to inform Webkit's developers. Please report new issues in new bugs. I have a few quick comments however. > Safari 5.0.5: ⟨ and ⟩ are not converted to the new values. This was a reckless incompatible spec change. We can update to the new mapping once it's implemented in other browsers, but not just because of a spec change. > Chrome 11.0.696.71: ⋹̸ and ⋵̸ are converted to ¬inE; and ¬indot; This is something we need to add support for (please file a new bug). Note this is correct parsing for an engine that doesn't support these character references. (In reply to comment #40) > Please report new issues in new bugs. I have a few quick comments however. I didn't know if this should be dealt in Webkit or if it specific to each browser. So I preferred to leave a comment here rather than opening duplicated or misclassified bugs. > > > Safari 5.0.5: ⟨ and ⟩ are not converted to the new values. > > This was a reckless incompatible spec change. We can update to the new mapping once it's implemented in other browsers, but not just because of a spec change. It's already in stable versions of Chrome and Firefox, at least (I don't know for the other browsers). > > > Chrome 11.0.696.71: ⋹̸ and ⋵̸ are converted to ¬inE; and ¬indot; > > This is something we need to add support for (please file a new bug). Note this is correct parsing for an engine that doesn't support these character references. OK. Again, should I report this in Webkit bugzilla or is it best to do that in a dedicated place for Chrome? > OK. Again, should I report this in Webkit bugzilla or is it best to do that in a dedicated place for Chrome?
WebKit please. Thanks!
> It's already in stable versions of Chrome and Firefox, at least (I don't know for the other browsers). If Chrome has updated, then the change is probably in mainline WebKit - I don't think that this is something they customize. Please test with a nightly build from http://nightly.webkit.org or with Safari 5.1 developer preview. (In reply to comment #43) > > It's already in stable versions of Chrome and Firefox, at least (I don't know for the other browsers). > > If Chrome has updated, then the change is probably in mainline WebKit - I don't think that this is something they customize. Please test with a nightly build from http://nightly.webkit.org or with Safari 5.1 developer preview. I just tested with a nightly build. The entities lang and rang are still mapped to the former values. > I just tested with a nightly build. The entities lang and rang are still mapped to the former values. The entities are stored here: http://trac.webkit.org/browser/trunk/Source/WebCore/html/parser/HTMLEntityNames.in They should be the same for all ports. In any case, please file a new bug (in this Bugzilla). (In reply to comment #45) > > I just tested with a nightly build. The entities lang and rang are still mapped to the former values. > > The entities are stored here: > > http://trac.webkit.org/browser/trunk/Source/WebCore/html/parser/HTMLEntityNames.in > > They should be the same for all ports. I just tested again with the latest release of Safari and lang and rang are now mapped to the new entities. Thanks. |