RESOLVED LATER 76515
Update HTML Parser so that <content> element should be treated transparently.
https://bugs.webkit.org/show_bug.cgi?id=76515
Summary Update HTML Parser so that <content> element should be treated transparently.
Hayato Ito
Reported 2012-01-17 21:47:19 PST
Suppose we have a <content> element and are given the following HTML:

  <table>
    <content>
      <td>hello</td>
    </content>
  </table>

In the current HTML parser implementation, the resulting DOM tree would become:

  <content>
  </content>
  <table>
    <tbody>
      <tr>
        <td>hello</td>
      </tr>
    </tbody>
  </table>

Yeah, the <content> element would be put outside of the <table> element. The result should be something like this:

  <table>
    <content>
      <td>hello</td>
    </content>
  </table>

I am not sure whether we should add <tbody> or <tr> somewhere in this particular case. At the least, the current implementation should be modified somehow.
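The reported behavior can be imitated with a small Python sketch. It is a toy model of only two rules (a bare <td> in a table gets an implied <tbody> and <tr>, and an unexpected element in a table is foster-parented before the table). It is not the HTML tree-construction algorithm and not WebKit code; `Node`, `toy_parse`, and the token format are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare nodes by identity, not by value
class Node:
    tag: str
    children: list = field(default_factory=list)  # Node or str items

    def html(self) -> str:
        inner = "".join(c if isinstance(c, str) else c.html() for c in self.children)
        return f"<{self.tag}>{inner}</{self.tag}>"


def toy_parse(tokens):
    """Toy model of two table-parsing rules; NOT the full HTML algorithm."""
    body = Node("body")
    table = row = cell = None
    for kind, value in tokens:
        if kind == "start" and value == "table":
            table = Node("table")
            body.children.append(table)
        elif kind == "start" and value == "td" and table is not None:
            if row is None:
                # A bare <td> in a table gets an implied <tbody> and <tr>.
                tbody = Node("tbody")
                row = Node("tr")
                tbody.children.append(row)
                table.children.append(tbody)
            cell = Node("td")
            row.children.append(cell)
        elif kind == "start" and table is not None and cell is None:
            # Foster parenting: the unexpected element goes before the table.
            body.children.insert(body.children.index(table), Node(value))
        elif kind == "text" and cell is not None:
            cell.children.append(value)
        elif kind == "end" and value == "td":
            cell = None
        elif kind == "end" and value == "table":
            table = row = cell = None
        # Every other token (such as the stray </content>) is ignored in this toy.
    return body


# Token stream for: <table><content><td>hello</td></content></table>
tokens = [
    ("start", "table"), ("start", "content"), ("start", "td"), ("text", "hello"),
    ("end", "td"), ("end", "content"), ("end", "table"),
]
print(toy_parse(tokens).html())
```

In this model the <content> node ends up as an empty sibling before the <table>, and the <td> lands inside the implied <tbody> and <tr>, which mirrors the tree shown in the report.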
Attachments
Sketch of possible changes (likely very broken) (2.42 KB, patch)
2012-01-19 16:27 PST, Adam Barth
no flags
Hayato Ito
Comment 1 2012-01-17 21:58:52 PST
I found the related spec here:
http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html#insertion-mode

So we might modify this spec or add some description to the shadow DOM spec.
Adam Barth
Comment 2 2012-01-17 23:46:17 PST
Changing the parser is a delicate matter. Please coordinate any parser changes with Hixie. We prefer to keep our implementation in close sync with the spec.
Adam Barth
Comment 3 2012-01-17 23:50:29 PST
Is there something I can read so I can understand what changes you'd like to make to the parser and what problem they solve? (Maybe something in the Explainer is relevant here?)
Hayato Ito
Comment 4 2012-01-18 01:10:18 PST
Okay. I don't yet have a clear explanation of exactly what the problem is or how we should solve it, but let me explain the use cases with some examples. I might update this later, after I've read the related specs carefully.

Suppose we have the following component (see https://dvcs.w3.org/hg/webcomponents/raw-file/tip/spec/shadow/index.html for the shadow DOM spec):

  <div> - component (shadow-host)
    <td>light children1</td>
    <td>light children2</td>
    [shadow-root] - template follows:
      <table>
        <content select='xxxx'>
          <!-- these are fallback elements -->
          <td>td1</td>
          <td>td2</td>
        </content>
        <td>td3</td>
      </table>

If the <content> element is replaced with the light children, this component should be rendered as if the following HTML were given:

  <div>
    <table>
      <tbody>
        <tr>
          <td>light children1</td>
          <td>light children2</td>
          <td>td3</td>
        </tr>
      </tbody>
    </table>
  </div>

If the <content> element is not replaced, this component should be rendered as if the following HTML were given:

  <div>
    <table>
      <tbody>
        <tr>
          <td>td1</td>
          <td>td2</td>
          <td>td3</td>
        </tr>
      </tbody>
    </table>
  </div>

To achieve this, what DOM tree should the HTML parser generate when parsing the template? According to the current spec, a <content> element is *invalid* in "IN_TABLE" mode when parsing, so the <content> element would be inserted outside of the <table> element like this (see also http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#foster-parenting):

  <content select='xxxx'>
  </content>
  <table>
    <tbody>
      <tr>
        <!-- these are fallback elements -->
        <td>td1</td>
        <td>td2</td>
        <td>td3</td>
      </tr>
    </tbody>
  </table>

This is an undesirable result: we lose the position of the <content> element. So we should treat the <content> element specially inside <table> and similar elements.

* Note that the <tbody> and <tr> elements are inserted by the parser.

So what DOM tree should the parser generate in this case? I've come up with three rough ideas.

1). Insert <tbody> and <tr> inside of the <content> element.

  <table>
    <content select='xxxx'>
      <tbody>
        <tr>
          <td>td1</td>
          <td>td2</td>
        </tr>
      </tbody>
    </content>
    <td>td3</td>
  </table>

This is apparently wrong in this case.

2). Insert <tbody> and <tr> outside of the <content> element and its siblings.

  <table>
    <tbody>
      <tr>
        <content select='xxxx'>
          <td>td1</td>
          <td>td2</td>
        </content>
        <td>td3</td>
      </tr>
    </tbody>
  </table>

This looks good in this case.

3). Avoid auto-inserting such elements across <content> elements. Return the DOM tree 'as is'.

  <table>
    <content select='xxxx'>
      <td>td1</td>
      <td>td2</td>
    </content>
    <td>td3</td>
  </table>

Is this the ideal one? Can we achieve that?

Please correct me or add anything you notice; I may have missed something.
I don't understand how such modification for the parser is tough.
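The two rendering cases described in this comment (light children replace <content>, otherwise the fallback children are used) can be sketched in Python. This is only an illustration under loud assumptions: `Node` and `compose` are hypothetical names, the `select` attribute is ignored, and every <content> receives all light children, which is simpler than what the shadow DOM spec describes. The template uses the tree shape from idea 2.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str
    children: list = field(default_factory=list)  # Node or str items

    def html(self) -> str:
        inner = "".join(c if isinstance(c, str) else c.html() for c in self.children)
        return f"<{self.tag}>{inner}</{self.tag}>"


def compose(node, light_children):
    """Return a copy of `node` with each <content> replaced by what it renders.

    Hypothetical helper: ignores `select` and hands all light children to
    every <content> element it meets.
    """
    out = Node(node.tag)
    for child in node.children:
        if isinstance(child, str):
            out.children.append(child)
        elif child.tag == "content":
            # Light children win; with none, the fallback children are rendered.
            out.children.extend(light_children or child.children)
        else:
            out.children.append(compose(child, light_children))
    return out


def td(text):
    return Node("td", [text])


# Template shaped like idea 2: <tbody> and <tr> sit outside <content>.
template = Node("table", [Node("tbody", [Node("tr", [
    Node("content", [td("td1"), td("td2")]),
    td("td3"),
])])])

light = [td("light children1"), td("light children2")]
print(compose(template, light).html())  # light children replace <content>
print(compose(template, []).html())     # fallback cells are used instead
```

With the tree from idea 2, both calls produce a single row of three cells, which matches the two renderings the comment asks for. With the foster-parented tree the <content> node is no longer inside the row, so there is no position left to substitute into.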
Adam Barth
Comment 5 2012-01-18 01:19:54 PST
> I don't understand how such modification for the parser is tough.

It's not the modifications themselves that are tough, it's understanding all the consequences. Have you considered how these documents will be handled by user agents that don't understand these new parsing rules? What about other insertion modes besides InTable?

I'm very hesitant to make changes to the parser without those changes first appearing in the HTML living standard. The cost of having our parser diverge from the standard and from all the other browsers (who now implement the standard) is much higher than in other areas, such as JavaScript APIs.
Hayato Ito
Comment 6 2012-01-18 02:49:52 PST
Hi Adam, thank you for the comment.

I totally agree with your concerns. We have to consider all the consequences in addition to the spec change. Let us continue to pursue what should be done, and use this bug for tracking the issue and updating the status.

(In reply to comment #5)
> > I don't understand how such modification for the parser is tough.
>
> It's not the modifications themselves that are tough, it's understanding all the consequences. Have you considered how these documents will be handled by user agents that don't understand these new parsing rules? What about other insertion modes besides InTable?
>
> I'm very hesitant to make changes to the parser without those changes first appearing in the HTML living standard. The cost of having our parser diverge from the standard and from all the other browsers (who now implement the standard) is much higher than in other areas, such as JavaScript APIs.
Dimitri Glazkov (Google)
Comment 7 2012-01-18 08:08:46 PST
I think this is a bit of a cart-before-horse situation here. The spec as I wrote it doesn't -- yet -- specify any changes to the parser.
Hayato Ito
Comment 8 2012-01-18 18:57:09 PST
I know we are starting the discussion in another place,
http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2012-January/034410.html
but let me add one more note here, because the example I used in my explanation might not be a good one.

Focus on the shadow host itself, ignoring the shadow tree:

  <div>
    <td>light children1</td>
    <td>light children2</td>
  </div>

This is totally invalid. If this host tree is parsed, the <td> elements are dropped because they do not appear in IN_TABLE mode. Their contents become just text nodes, like:

  <div>
    light children1
    light children2
  </div>

So we cannot use light children that are valid only when inserted into <content>. I just want to point out this limitation, and that my example was not a good one.
Adam Barth
Comment 9 2012-01-19 16:27:34 PST
Created attachment 123220
Sketch of possible changes (likely very broken)
Dominic Cooney
Comment 10 2013-01-06 19:41:35 PST
Per Comment 7, although the behavior is… interesting… it is per the spec. I’m closing this for now.