Modify HTML parser to support <template> element
Created attachment 127223 [details] WIP: First stab.
For the moment, this is a playground for me to learn about parser and understand what changes to the algorithm are necessary.
Comment on attachment 127223 [details] WIP: First stab. View in context: https://bugs.webkit.org/attachment.cgi?id=127223&action=review I know you said that this patch was a just a playground, but I thought the comments below might be helpful: > Source/WebCore/html/parser/HTMLTreeBuilder.cpp:2248 > + if (token.name() == templateTag) { > + m_tree.openElements()->pop(); > + return; > + } This change is slightly surprising. You could be popping any sort of element off the stack, not necessarily a template element. You probably either want: if (m_tree.currentNode()->hasTagName(templateTag)) { m_tree.openElements()->pop(); return; } or m_tree.openElements()->popUntilPopped(templateTag.localName()); resetInsertionModeAppropriately(); return; That's likely what's triggering your ASSERT on line 1686. > Source/WebCore/html/parser/HTMLTreeBuilder.cpp:2275 > + case TemplateContentsMode: > + m_tree.openElements()->pop(); > + resetInsertionModeAppropriately(); > + break; This parsing might be somewhat surprising: <div><template></div>Hello I think it would result in the following DOM: - <div> - <template> - #text: hello I'm not 100% sure what the best thing to do in this case is. Maybe: 1) If the end tag is </template>, then pop the open elements. 2) otherwise, pop the pop the open elements, reset the insertion mode, and reprocess the token ?
Comment on attachment 127223 [details] WIP: First stab. View in context: https://bugs.webkit.org/attachment.cgi?id=127223&action=review > LayoutTests/html5lib/resources/template.dat:187 > +| <table> > +| <thead> > +| <template> > +| <tr> > +| <td> This case is interesting. Where should implied <tr> appear?
Comment on attachment 127223 [details] WIP: First stab. View in context: https://bugs.webkit.org/attachment.cgi?id=127223&action=review >> LayoutTests/html5lib/resources/template.dat:187 >> +| <td> > > This case is interesting. Where should implied <tr> appear? Conceptually, from a parsing point of view, what you've written here is the most natural. The <td> creates the implied <tr>, and it would do so as a child of the currently open element, which is the <template>.
Comment on attachment 127223 [details] WIP: First stab. View in context: https://bugs.webkit.org/attachment.cgi?id=127223&action=review >> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:2248 >> + } > > This change is slightly surprising. You could be popping any sort of element off the stack, not necessarily a template element. You probably either want: > > if (m_tree.currentNode()->hasTagName(templateTag)) { > m_tree.openElements()->pop(); > return; > } > > or > > m_tree.openElements()->popUntilPopped(templateTag.localName()); > resetInsertionModeAppropriately(); > return; > > That's likely what's triggering your ASSERT on line 1686. Yay, thanks! I misunderstood what token means in this context. >> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:2275 >> + break; > > This parsing might be somewhat surprising: > > <div><template></div>Hello > > I think it would result in the following DOM: > > - <div> > - <template> > - #text: hello > > I'm not 100% sure what the best thing to do in this case is. Maybe: > > 1) If the end tag is </template>, then pop the open elements. > 2) otherwise, pop the pop the open elements, reset the insertion mode, and reprocess the token > > ? Oh, right. Yes. I think your plan makes sense.
There's a general question of how "scoping" you want </template> to be. In the <div><template></div>Hello example, there are two reasonable parses: - <div> - <template> - #text: Hello or - <div> - <template> - #text: Hello In the first case, <template> is strongly scoping, which means it's hard to escape from the template element's scope. In the second case, the template element isn't very scoping, which means the </div> token can find and close the div element. I'd recommend trying to match the scoping of down-rev parsers as much as feasible, which, in this case, means you'll want the second (less scoping) parsing.
Some examples: <div><object></div>Hello <div><foo></div>Hello These examples parse differently because <object> is pretty strongly scoping.
Created attachment 128299 [details] Moar better, moar tests.
Comment on attachment 128299 [details] Moar better, moar tests. View in context: https://bugs.webkit.org/attachment.cgi?id=128299&action=review This is starting to look really good. > Source/WebCore/html/parser/HTMLTreeBuilder.cpp:2281 > + case TemplateContentsMode: > + if (token.name() == templateTag) { > + m_tree.openElements()->pop(); > + resetInsertionModeAppropriately(); > + } > + break; Can you ASSERT here that the top element is a templateTag? > Source/WebCore/html/parser/HTMLTreeBuilder.cpp:2578 > + case TemplateContentsMode: > + // FIXME: What should we do here? Probably pop the <template> element off the stack of open element, reset the insertion mode appropriately, and reprocess the token. :)
Comment on attachment 128299 [details] Moar better, moar tests. Don't you hide <template> behind the flag?
(In reply to comment #11) > (From update of attachment 128299 [details]) > Don't you hide <template> behind the flag? Nah. This patch is just for me to learn what to write in the spec.
> Nah. This patch is just for me to learn what to write in the spec. Ah got it.
Created attachment 128579 [details] Faster, harder, stronger.
Dimtiri, can we close this now?
(In reply to comment #15) > Dimtiri, can we close this now? Yes sir! Please dupe it against the real bug for future archeologists.
*** This bug has been marked as a duplicate of bug 86031 ***