... and not retokenize the input.
Created attachment 100524 [details] WIP patch This is a work in progress patch. I welcome your feedback.
Performance results on PerformanceTests/Parser/html-parser.html are not really encouraging. Reference: avg 1736 median 1735.5 stdev 6.957010852370434 min 1717 max 1748 Patched: avg 1881.35 median 1882 stdev 4.33906671992953 min 1865 max 1886
The idea for this work is came from https://bugs.webkit.org/show_bug.cgi?id=63531.
You might have better luck caching AtomicHTMLTokens.
After some investigation I think the whole idea is wrong. I thought that the preload scanner scanning the input again after the parser but now it seems to be it does not. The preload scanner only scans text that the parser has not seen yet. In this case it is useless to feed it with "old" tokens. Adam, could you validate this for me? Thanks.
(In reply to comment #5) > After some investigation I think the whole idea is wrong. I thought that the preload scanner scanning the input again after the parser but now it seems to be it does not. The preload scanner only scans text that the parser has not seen yet. In this case it is useless to feed it with "old" tokens. Adam, could you validate this for me? Thanks. It is the other way around. The preload scanner should feed its tokens to the parser. There is no case where it would make sense for the parser to feed tokens to the preload scanner. For example, consider: 1. <img/> 2. <script src="slow.js"></script> 3. <img/> The Parser would tokenize #1 and #2, then the parser is forced to pause while the script downloads. In the mean time, the PreloadScanner would tokenize #3. After the script runs, the parser would have to retokenize #3 unless we save the work done by the preload scanner. There is a little more information in this bug: https://bugs.webkit.org/show_bug.cgi?id=57376 We should dupe one of them.
(In reply to comment #6) > (In reply to comment #5) > > After some investigation I think the whole idea is wrong. I thought that the preload scanner scanning the input again after the parser but now it seems to be it does not. The preload scanner only scans text that the parser has not seen yet. In this case it is useless to feed it with "old" tokens. Adam, could you validate this for me? Thanks. > > It is the other way around. The preload scanner should feed its tokens to the parser. There is no case where it would make sense for the parser to feed tokens to the preload scanner. > > For example, consider: > 1. <img/> > 2. <script src="slow.js"></script> > 3. <img/> > > The Parser would tokenize #1 and #2, then the parser is forced to pause while the script downloads. In the mean time, the PreloadScanner would tokenize #3. After the script runs, the parser would have to retokenize #3 unless we save the work done by the preload scanner. Thanks for the clarificatoin. > > There is a little more information in this bug: > https://bugs.webkit.org/show_bug.cgi?id=57376 > > We should dupe one of them. Yep, I am going to mark this one as a duplicate. *** This bug has been marked as a duplicate of bug 57376 ***