Bug 45271 - HTML parser should provide script column position within HTML document to JavaScript engine
Summary: HTML parser should provide script column position within HTML document to Jav...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on: 48007 48249
Blocks:
  Show dependency treegraph
 
Reported: 2010-09-06 10:23 PDT by Peter Rybin
Modified: 2010-11-10 12:23 PST (History)
13 users (show)

See Also:


Attachments
draft patch (not inteded for code review), only for build bots (30.62 KB, patch)
2010-09-06 10:30 PDT, Peter Rybin
no flags Details | Formatted Diff | Diff
Another upload to trigger bots (47 bytes, text/plain)
2010-09-06 11:41 PDT, anton muhin
no flags Details
Patch (38.63 KB, patch)
2010-09-06 11:56 PDT, anton muhin
abarth: review-
Details | Formatted Diff | Diff
Patch (68.84 KB, patch)
2010-09-20 06:44 PDT, Peter Rybin
no flags Details | Formatted Diff | Diff
Patch (69.45 KB, patch)
2010-09-22 10:35 PDT, Peter Rybin
no flags Details | Formatted Diff | Diff
Patch (57.85 KB, patch)
2010-09-23 14:25 PDT, Peter Rybin
no flags Details | Formatted Diff | Diff
Patch (57.86 KB, patch)
2010-09-24 11:11 PDT, Peter Rybin
no flags Details | Formatted Diff | Diff
Patch (57.07 KB, patch)
2010-09-24 14:21 PDT, Peter Rybin
no flags Details | Formatted Diff | Diff
Patch (57.76 KB, patch)
2010-10-07 05:49 PDT, Peter Rybin
no flags Details | Formatted Diff | Diff
Patch (57.72 KB, patch)
2010-10-08 07:31 PDT, Peter Rybin
no flags Details | Formatted Diff | Diff
Patch (58.58 KB, patch)
2010-10-19 15:02 PDT, Peter Rybin
no flags Details | Formatted Diff | Diff
Patch (58.65 KB, patch)
2010-10-20 07:47 PDT, Peter Rybin
no flags Details | Formatted Diff | Diff
Patch (59.58 KB, patch)
2010-10-23 15:02 PDT, Peter Rybin
no flags Details | Formatted Diff | Diff
Patch to land (59.50 KB, patch)
2010-11-10 05:12 PST, Peter Rybin
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Peter Rybin 2010-09-06 10:23:41 PDT
WebKit parses HTML and passes all found script texts to JavaScript engine (usually JSC or V8). JavaScript engine also receives line number that corresponds to the start of script source within HTML document.

The engine should also receive column of script start. E.g. in WebCore/bindings/v8/V8Proxy.cpp in method v8::Handle<v8::Script> V8Proxy::compileScript has "baseLine" parameter and lacks "baseColumn" parameter.

The column number is essential for:
- correct column number in error message for one-liner scripts
- correct work of script live editing: if user edited HTML with 2 scripts in one line, there's no way of telling which script must be updated in VM, becuase they both have the same line number.
  E.g. :
    ... onClick="javascript:HandleClick('main')" onFocus="javascript:HandleFocus('main')" ...
Comment 1 Peter Rybin 2010-09-06 10:30:46 PDT
Created attachment 66657 [details]
draft patch (not inteded for code review), only for build bots
Comment 2 WebKit Commit Bot 2010-09-06 10:47:22 PDT
Comment on attachment 66657 [details]
draft patch (not inteded for code review), only for build bots

Rejecting patch 66657 from commit-queue.

peter.rybin@gmail.com does not have committer permissions according to http://trac.webkit.org/browser/trunk/WebKitTools/Scripts/webkitpy/common/config/committers.py.

- If you do not have committer rights please read http://webkit.org/coding/contributing.html for instructions on how to use bugzilla flags.

- If you have committer rights please correct the error in WebKitTools/Scripts/webkitpy/common/config/committers.py by adding yourself to the file (no review needed).  The commit-queue restarts itself every 2 hours.  After restart the commit-queue will correctly respect your committer rights.
Comment 3 anton muhin 2010-09-06 11:37:36 PDT
Comment on attachment 66657 [details]
draft patch (not inteded for code review), only for build bots

Cleaning up before I rerequest to run it through BBs.
Comment 4 anton muhin 2010-09-06 11:38:15 PDT
Comment on attachment 66657 [details]
draft patch (not inteded for code review), only for build bots

Please, do not review---let's run it through BBs first.
Comment 5 anton muhin 2010-09-06 11:41:45 PDT
Created attachment 66663 [details]
Another upload to trigger bots
Comment 6 WebKit Review Bot 2010-09-06 11:43:34 PDT
Attachment 66663 [details] did not pass style-queue:

Failed to run "['WebKitTools/Scripts/check-webkit-style']" exit_code: 1
Total errors found: 0 in 0 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 7 anton muhin 2010-09-06 11:56:20 PDT
Created attachment 66664 [details]
Patch
Comment 8 Eric Seidel (no email) 2010-09-06 12:15:30 PDT
Attachment 66664 [details] did not build on mac:
Build output: http://queues.webkit.org/results/3968232
Comment 9 Adam Barth 2010-09-07 12:27:41 PDT
Comment on attachment 66664 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=66664&action=prettypatch

R- for lack of testing and for likely performance problems.  Check out how view source figures out columns.  It's fast, works, and has some amount of testing.

> WebCore/ChangeLog:8
> +        TODO(peter): fill in description.
Would be nice to have a description.

> WebCore/ChangeLog:10
> +        No new tests. (OOPS!)
Tests?

> WebCore/bindings/v8/V8Proxy.cpp:402
> -        v8::Handle<v8::Script> script = compileScript(code, source.url(), source.startLine() - 1, scriptData.get());
> +        v8::Handle<v8::Script> script = compileScript(code, source.url(), source.startLine() - 1, source.startColumn() - 1, scriptData.get());
This whole +1/-1 thing is sadness.  We should leverage the C++ type system to help us here.  V8LineNumber / ScriptLineNumber or some such.

> WebCore/html/parser/HTMLScriptRunner.cpp:312
> -void HTMLScriptRunner::runScript(Element* script, int startingLineNumber)
> +void HTMLScriptRunner::runScript(Element* script, int startingLineNumber, int startingColumnNumber)
Does it make sense to put the line number and the column number together to a struct?  ScriptPosition?  Seems like you're touching all these lines of code anyway.

> WebCore/platform/text/SegmentedString.cpp:237
> +          ++cursorPosition.column;
Did you run the parser benchmark?  I bet this costs a bunch of time.  If you look at how the view source code figures out column numbers, you'll find a much faster way.
Comment 10 Peter Rybin 2010-09-20 06:44:35 PDT
Created attachment 68076 [details]
Patch
Comment 11 Early Warning System Bot 2010-09-20 06:52:38 PDT
Attachment 68076 [details] did not build on qt:
Build output: http://queues.webkit.org/results/4044079
Comment 12 WebKit Review Bot 2010-09-20 06:59:12 PDT
Attachment 68076 [details] did not build on gtk:
Build output: http://queues.webkit.org/results/4069082
Comment 13 WebKit Review Bot 2010-09-20 07:11:50 PDT
Attachment 68076 [details] did not build on chromium:
Build output: http://queues.webkit.org/results/4001062
Comment 14 WebKit Review Bot 2010-09-20 08:22:11 PDT
Attachment 68076 [details] did not build on win:
Build output: http://queues.webkit.org/results/4012095
Comment 15 Adam Barth 2010-09-20 10:41:19 PDT
Comment on attachment 68076 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=68076&action=review

I like the approach with the specific types, but there are a lot of errors in this patch.  Also please post html-parser benchmark scores with your patch.  This code is highly performance sensitive.

> JavaScriptCore/JavaScriptCore.gypi:388
> +            'wtf/HTMLPosition.h',
> +            'wtf/HTMLPosition.cpp',

This isn't the right place for these files.  WTF doesn't know anything about HTML.  They should probably go in WebCore/html/parser.

> JavaScriptCore/wtf/HTMLPosition.h:86
> +/* An int wrapper that always reminds you that the number should be 0-based */

We generally use C++ style comments.

> WebCore/bindings/js/ScriptSourceCode.h:46
> +    ScriptSourceCode(const String& source, const KURL& url = KURL(),
> +        const HTMLPosition1& startPosition = HTMLPosition1::minimumPosition())

Please put this on one line.

> WebCore/bindings/v8/ScriptController.cpp:263
> +HTMLPosition0 ScriptController::eventHandlerPosition() const

What is HTMLPosition0 ?

> WebCore/dom/XMLDocumentParserLibxml2.cpp:1372
> +    // TODO(peter.rybin): the implementation probably return 1-based int, but method should return 0-based.

WebKit uses FIXME instead of TODO(name).  Also, this is not a complete sentence.

> WebCore/dom/XMLDocumentParserLibxml2.cpp:1374
> +    int res = (context() ? context()->input->line : 1) - 1;
> +    printf("line number %d\n", res);

We don't usually call printf.  Also, "res" is a pretty unintelligible variable name.

> WebCore/dom/XMLDocumentParserLibxml2.cpp:1391
> +    if (context)
> +        return HTMLPosition0(WTF::ZeroBasedNumber::fromZeroBasedInt(context->input->line - 1),
> +            WTF::ZeroBasedNumber::fromZeroBasedInt(context->input->col - 1), 0);

This multi-line if should use curly braces.

This code doem't make much sense.  It looks like context->input is one-based.  Shouldn't we create a one-based number from that and then covert that to a zero-based number instead of subtracting manually?

> WebCore/html/parser/HTMLDocumentParser.cpp:84
> -    , m_treeBuilder(HTMLTreeBuilder::create(m_tokenizer.get(), document, reportErrors))
> +    , m_treeBuilder(HTMLTreeBuilder::create(m_tokenizer.get(), this, document, reportErrors))

This change is not correct.  The HTMLTreeBuilder isn't allowed to know about the HTMLDocumentParser.  There are going to be situations in the future where the HTMLTreeBuilder is used without an HTMLDocumentParser.

> WebCore/html/parser/HTMLDocumentParser.cpp:421
> +    int line = m_input.current().getCurrentLine();
> +    int column = m_input.current().getCurrentColumn();
> +    int generation = m_input.getInsertionLevel();

Generally, we don't use a "get" prefix for getters.  We just call the methods by the thing they return.  Why isn't m_input.current() returning a HTMLPosition?

> WebCore/html/parser/HTMLInputStream.h:53
> -        : m_last(&m_first)
> +        : m_last(&m_first), m_insertionLevel(0)

The HTMLInputStream should not have an insertionLevel.  That's a concept of the HTMLDocumentParser.  (Also, it's called a nesting level.)

> WebCore/html/parser/HTMLInputStream.h:145
> +        m_insertionLevel = m_inputStream->insertionLevel();
> +        m_inputStream->setInsertionLevel(m_insertionLevel + 1);

This is redundantly tracking the nestingLevel, which is already tracked carefully in the HTMLDocumentParser.

> WebCore/html/parser/HTMLTreeBuilder.cpp:2748
> +    HTMLPosition0 position = m_scriptableDocumentParser->htmlPosition();

How can a ScriptableDocumentParser have an HTMLPosition?  ScriptableDocumentParser does not necessary have anything to do with HTML.  This makes no sense.

> WebCore/platform/text/SegmentedString.cpp:239
> +        if (*m_currentString.m_current++ == '\n' && m_currentString.doNotExcludeLineNumbers()) {
> +          ++lineNumber;
> +          ++m_currentLine;
> +          m_lineBeginCharactersConsumedValue = numberOfCharactersConsumed();
> +        }

Wrong indent.

> WebCore/platform/text/SegmentedString.cpp:254
> +int SegmentedString::currentLine() const
> +{
> +    return m_currentLine;
> +}
> +
> +int SegmentedString::currentColumn() const
> +{
> +    return numberOfCharactersConsumed() - m_lineBeginCharactersConsumedValue;
> +}

Why not just return a position?

> WebCore/platform/text/SegmentedString.h:166
> +            if (m_currentString.doNotExcludeLineNumbers()) {
> +                ++lineNumber;
> +                ++m_currentLine;
> +                m_lineBeginCharactersConsumedValue = numberOfCharactersConsumed();
> +            }

I'm confused.  I though I did all this work already for the view source parser.  Also, this code explicitly uses += to avoid a branch here.  Did you run the parsing benchmark???
Comment 16 Eric Seidel (no email) 2010-09-20 10:46:04 PDT
This is the benchmark to which Adam refers:
http://trac.webkit.org/browser/trunk/WebCore/benchmarks/parser/html-parser.html
Comment 17 Peter Rybin 2010-09-22 10:32:07 PDT
(In reply to comment #15)
> (From update of attachment 68076 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=68076&action=review
> 
> I like the approach with the specific types, but there are a lot of errors in this patch.  Also please post html-parser benchmark scores with your patch.  This code is highly performance sensitive.


Hi Adam
Thanks a lot for the review. I sorry for posting a code with so many errors.

I guess there are 2 issues here:

1. HTMLPosition

a. Name. It might be a wrong name actually.
It's a simple structure of line/column, but I also had to add a third field 'generation'. I need to reflect the fact that some 'script' (and other) elements may be generated in 'document.write' call. The positions go to JavaScript debugger and it is important that it treats positions of generated script differently; possibly it doesn't need positions for the generated script at all. I am not sure I chose the best solution possible.

b. Scope. 'HTMLPosition' goes together with the script data everywhere, so it seems okay for ScriptableDocumentParser to depend on it.


2. Column in SegmentedString.
Unfortunately I couldn't read your hint about view source parser. I haven't found anything related to columns or something I could use. What do I miss there?

I am running a test that Eric Seidel (thanks to him!) gave me a reference to. In the test_shell from Chromium, release mode, nice -15.
The numbers in both cases are very divergent, so I'd like to make some more runs now and post result here later.

Peter


> 
> > JavaScriptCore/JavaScriptCore.gypi:388
> > +            'wtf/HTMLPosition.h',
> > +            'wtf/HTMLPosition.cpp',
> 
> This isn't the right place for these files.  WTF doesn't know anything about HTML.  They should probably go in WebCore/html/parser.
> 
> > JavaScriptCore/wtf/HTMLPosition.h:86
> > +/* An int wrapper that always reminds you that the number should be 0-based */
> 
> We generally use C++ style comments.
> 
> > WebCore/bindings/js/ScriptSourceCode.h:46
> > +    ScriptSourceCode(const String& source, const KURL& url = KURL(),
> > +        const HTMLPosition1& startPosition = HTMLPosition1::minimumPosition())
> 
> Please put this on one line.
> 
> > WebCore/bindings/v8/ScriptController.cpp:263
> > +HTMLPosition0 ScriptController::eventHandlerPosition() const
> 
> What is HTMLPosition0 ?
> 
> > WebCore/dom/XMLDocumentParserLibxml2.cpp:1372
> > +    // TODO(peter.rybin): the implementation probably return 1-based int, but method should return 0-based.
> 
> WebKit uses FIXME instead of TODO(name).  Also, this is not a complete sentence.
> 
> > WebCore/dom/XMLDocumentParserLibxml2.cpp:1374
> > +    int res = (context() ? context()->input->line : 1) - 1;
> > +    printf("line number %d\n", res);
> 
> We don't usually call printf.  Also, "res" is a pretty unintelligible variable name.
> 
> > WebCore/dom/XMLDocumentParserLibxml2.cpp:1391
> > +    if (context)
> > +        return HTMLPosition0(WTF::ZeroBasedNumber::fromZeroBasedInt(context->input->line - 1),
> > +            WTF::ZeroBasedNumber::fromZeroBasedInt(context->input->col - 1), 0);
> 
> This multi-line if should use curly braces.
> 
> This code doem't make much sense.  It looks like context->input is one-based.  Shouldn't we create a one-based number from that and then covert that to a zero-based number instead of subtracting manually?
> 
> > WebCore/html/parser/HTMLDocumentParser.cpp:84
> > -    , m_treeBuilder(HTMLTreeBuilder::create(m_tokenizer.get(), document, reportErrors))
> > +    , m_treeBuilder(HTMLTreeBuilder::create(m_tokenizer.get(), this, document, reportErrors))
> 
> This change is not correct.  The HTMLTreeBuilder isn't allowed to know about the HTMLDocumentParser.  There are going to be situations in the future where the HTMLTreeBuilder is used without an HTMLDocumentParser.
> 
> > WebCore/html/parser/HTMLDocumentParser.cpp:421
> > +    int line = m_input.current().getCurrentLine();
> > +    int column = m_input.current().getCurrentColumn();
> > +    int generation = m_input.getInsertionLevel();
> 
> Generally, we don't use a "get" prefix for getters.  We just call the methods by the thing they return.  Why isn't m_input.current() returning a HTMLPosition?
> 
> > WebCore/html/parser/HTMLInputStream.h:53
> > -        : m_last(&m_first)
> > +        : m_last(&m_first), m_insertionLevel(0)
> 
> The HTMLInputStream should not have an insertionLevel.  That's a concept of the HTMLDocumentParser.  (Also, it's called a nesting level.)
> 
> > WebCore/html/parser/HTMLInputStream.h:145
> > +        m_insertionLevel = m_inputStream->insertionLevel();
> > +        m_inputStream->setInsertionLevel(m_insertionLevel + 1);
> 
> This is redundantly tracking the nestingLevel, which is already tracked carefully in the HTMLDocumentParser.
> 
> > WebCore/html/parser/HTMLTreeBuilder.cpp:2748
> > +    HTMLPosition0 position = m_scriptableDocumentParser->htmlPosition();
> 
> How can a ScriptableDocumentParser have an HTMLPosition?  ScriptableDocumentParser does not necessary have anything to do with HTML.  This makes no sense.
> 
> > WebCore/platform/text/SegmentedString.cpp:239
> > +        if (*m_currentString.m_current++ == '\n' && m_currentString.doNotExcludeLineNumbers()) {
> > +          ++lineNumber;
> > +          ++m_currentLine;
> > +          m_lineBeginCharactersConsumedValue = numberOfCharactersConsumed();
> > +        }
> 
> Wrong indent.
> 
> > WebCore/platform/text/SegmentedString.cpp:254
> > +int SegmentedString::currentLine() const
> > +{
> > +    return m_currentLine;
> > +}
> > +
> > +int SegmentedString::currentColumn() const
> > +{
> > +    return numberOfCharactersConsumed() - m_lineBeginCharactersConsumedValue;
> > +}
> 
> Why not just return a position?
> 
> > WebCore/platform/text/SegmentedString.h:166
> > +            if (m_currentString.doNotExcludeLineNumbers()) {
> > +                ++lineNumber;
> > +                ++m_currentLine;
> > +                m_lineBeginCharactersConsumedValue = numberOfCharactersConsumed();
> > +            }
> 
> I'm confused.  I though I did all this work already for the view source parser.  Also, this code explicitly uses += to avoid a branch here.  Did you run the parsing benchmark???
Comment 18 Peter Rybin 2010-09-22 10:35:29 PDT
Created attachment 68392 [details]
Patch
Comment 19 Adam Barth 2010-09-22 11:25:06 PDT
Comment on attachment 68392 [details]
Patch

You didn't address my comments on the previous patch.  You didn't provide benchmark scores.  Why do we need an HTMLPositionProvider?  You should just put the position information on the token since that's the thing that has a position.
Comment 20 Peter Rybin 2010-09-23 14:03:54 PDT
Comment on attachment 68076 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=68076&action=review

Hi Adam

I am splitting this long patch into 2 parts because it seems to be too inconvenient.
The upcoming part is about introducing TextPosition structure with 0/1-based numbers. I need it for passing column data from parser to JS engine.

Last time I failed to answer your inline comments. I'm sorry about it. I believe I addressed all your comments.

>>> JavaScriptCore/JavaScriptCore.gypi:388
>>> +            'wtf/HTMLPosition.cpp',
>> 
>> This isn't the right place for these files.  WTF doesn't know anything about HTML.  They should probably go in WebCore/html/parser.
> 
> 

Actually this sturcutre is going to be used wider than in WebCore/html/parser.
Since it's actually not HTML-specific, I renamed it into TextPosition.

>> JavaScriptCore/wtf/HTMLPosition.h:86
>> +/* An int wrapper that always reminds you that the number should be 0-based */
> 
> We generally use C++ style comments.

Done

>> WebCore/bindings/js/ScriptSourceCode.h:46
>> +        const HTMLPosition1& startPosition = HTMLPosition1::minimumPosition())
> 
> Please put this on one line.

Done

>> WebCore/bindings/v8/ScriptController.cpp:263
>> +HTMLPosition0 ScriptController::eventHandlerPosition() const
> 
> What is HTMLPosition0 ?

It's a variant of generic HTMLPosition for zero-based numbers. There's also HTMLPosition1. I chose this super-short form, don't know whether it's okay though.
Now all known as 'TextPosition*'
I hope to merge types  TextPosition0 + TextPosition1 => TextPosition  once we don't care about internal representation.

>> WebCore/dom/XMLDocumentParserLibxml2.cpp:1372
>> +    // TODO(peter.rybin): the implementation probably return 1-based int, but method should return 0-based.
> 
> WebKit uses FIXME instead of TODO(name).  Also, this is not a complete sentence.

Done.

>> WebCore/dom/XMLDocumentParserLibxml2.cpp:1374
>> +    printf("line number %d\n", res);
> 
> We don't usually call printf.  Also, "res" is a pretty unintelligible variable name.

I'm sorry, this shouldn't have stayed here.
Done

>> WebCore/dom/XMLDocumentParserLibxml2.cpp:1391
>> +            WTF::ZeroBasedNumber::fromZeroBasedInt(context->input->col - 1), 0);
> 
> This multi-line if should use curly braces.
> 
> This code doem't make much sense.  It looks like context->input is one-based.  Shouldn't we create a one-based number from that and then covert that to a zero-based number instead of subtracting manually?

Done
I did like the second part of comment.

>> WebCore/html/parser/HTMLDocumentParser.cpp:84
>> +    , m_treeBuilder(HTMLTreeBuilder::create(m_tokenizer.get(), this, document, reportErrors))
> 
> This change is not correct.  The HTMLTreeBuilder isn't allowed to know about the HTMLDocumentParser.  There are going to be situations in the future where the HTMLTreeBuilder is used without an HTMLDocumentParser.

Done
I dropped actual position providing out of patch scope for now.

>> WebCore/html/parser/HTMLDocumentParser.cpp:421
>> +    int generation = m_input.getInsertionLevel();
> 
> Generally, we don't use a "get" prefix for getters.  We just call the methods by the thing they return.  Why isn't m_input.current() returning a HTMLPosition?

Done.
My main concern was that HTMLPosition has 3 field while only 2 of them would be used here.
I dropped actual position providing out of patch scope for now.

>> WebCore/html/parser/HTMLInputStream.h:53
>> +        : m_last(&m_first), m_insertionLevel(0)
> 
> The HTMLInputStream should not have an insertionLevel.  That's a concept of the HTMLDocumentParser.  (Also, it's called a nesting level.)

Done

>> WebCore/html/parser/HTMLInputStream.h:145
>> +        m_inputStream->setInsertionLevel(m_insertionLevel + 1);
> 
> This is redundantly tracking the nestingLevel, which is already tracked carefully in the HTMLDocumentParser.

Done

>> WebCore/html/parser/HTMLTreeBuilder.cpp:2748
>> +    HTMLPosition0 position = m_scriptableDocumentParser->htmlPosition();
> 
> How can a ScriptableDocumentParser have an HTMLPosition?  ScriptableDocumentParser does not necessary have anything to do with HTML.  This makes no sense.

The script needs this type. I renamed it into TextPosition now, meaning that it's a position in any text document (HTML, XML etc)

>> WebCore/platform/text/SegmentedString.cpp:239
>> +        }
> 
> Wrong indent.

Done. Sorry, early patch upload.

>> WebCore/platform/text/SegmentedString.cpp:254
>> +}
> 
> Why not just return a position?

I was afraid that HTMLPosition has 3 fields while I need only 2 of them here.

>> WebCore/platform/text/SegmentedString.h:166
>> +            }
> 
> I'm confused.  I though I did all this work already for the view source parser.  Also, this code explicitly uses += to avoid a branch here.  Did you run the parsing benchmark???

I dropped this part out of patch for now.

However I can't find a way of putting 'numberOfCharactersConsumed()' call out of the branch.
I'm probably showing myself stupid, but I can't read your 'I did all this work already for the view source parser'. Unfortunately I couldn't find anything related to columns in HTMLViewSourceParser or related classes.
Comment 21 Peter Rybin 2010-09-23 14:25:23 PDT
Created attachment 68589 [details]
Patch
Comment 22 Early Warning System Bot 2010-09-23 14:38:14 PDT
Attachment 68589 [details] did not build on qt:
Build output: http://queues.webkit.org/results/4071116
Comment 23 WebKit Review Bot 2010-09-23 18:30:20 PDT
Attachment 68589 [details] did not build on win:
Build output: http://queues.webkit.org/results/4049143
Comment 24 Peter Rybin 2010-09-24 11:11:25 PDT
Created attachment 68715 [details]
Patch
Comment 25 Adam Barth 2010-09-24 11:27:30 PDT
Comment on attachment 68715 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=68715&action=review

This patch looks great.  I'd like to see one more iteration to address the comments below.  Can we drop generation from this patch?  It doesn't seem to be used.  We can add it (or something like it) in the next patch when we use it.  Thanks for sticking with it.

> JavaScriptCore/GNUmakefile.am:517
> +	JavaScriptCore/wtf/TextPosition.cpp \
> +	JavaScriptCore/wtf/TextPosition.h \

These should probably be in JavaScriptCore/wtf/text

> JavaScriptCore/wtf/TextPosition.h:39
> + * for saving script source position. The third field 'generation' reflects the fact
> + * that in HTML document its any part can be generated on the fly, by a 'document.write'
> + * call. While elements of the generated part may still have coordinates (probably),

I'm not sure generation is appropriate for WTF.  WTF doesn't know anything about document.write.

> JavaScriptCore/wtf/TextPosition.h:86
> +    int m_generation;

This doesn't seem right here.

> WebCore/bindings/js/ScriptSourceCode.h:45
> +    ScriptSourceCode(const String& source, const KURL& url = KURL(), const TextPosition1& startPosition = TextPosition1::minimumPosition())

TextPosition1 => Does this build?

> WebCore/bindings/v8/ScriptEventListener.cpp:54
> +    // FIXME: very strange: we initialize zero-based number with '1'.
> +    TextPosition0 position(WTF::ZeroBasedNumber::fromZeroBasedInt(1), WTF::ZeroBasedNumber::s_base, 0);

Yes, I think this code is wrong.  :)

Let's fix it in another patch.

> WebCore/dom/XMLDocumentParserLibxml2.cpp:1373
> +    // FIXME: the implementation probably returns 1-based int, but method should return 0-based.
> +    //   New "- 1" fixes this. Is it correct?

I'm confused.  Are there tests that break with or without the -1?  We need tests to cover this.  If there are tests, we can skip the comment.  If there are no tests, I'd leave the code as-is, can keep the first line of the comment (with a capital "T").  We can then write the test and fix the bug in a followup patch.

> WebCore/dom/XMLDocumentParserLibxml2.cpp:1381
> +    // FIXME: the implementation probably returns 1-based int, but method should return 0-based.
> +    //   New "- 1" fixes this. Is it correct?
> +    return  (context() ? context()->input->col : 1) - 1;

Same here.
Comment 26 Early Warning System Bot 2010-09-24 11:37:14 PDT
Attachment 68715 [details] did not build on qt:
Build output: http://queues.webkit.org/results/4086071
Comment 27 Peter Rybin 2010-09-24 14:02:53 PDT
Comment on attachment 68715 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=68715&action=review

Hi Adam.
I agree on simplifying the patch. The 'generation' field doesn't seem extremely elegant, however I need some way of telling that the script does not belong to original text (i.e. it is generated). It's not necessarily a position property, but at least it's something very close.

The only problem I'm still having is making "build-webkit --qt" use XmlDocumentParserQt.cc rather that *Libxml2.cc. That's why I've uploaded bad patches so often.
Peter

>> JavaScriptCore/GNUmakefile.am:517
>> +	JavaScriptCore/wtf/TextPosition.h \
> 
> These should probably be in JavaScriptCore/wtf/text

Done

>> JavaScriptCore/wtf/TextPosition.h:39
>> + * call. While elements of the generated part may still have coordinates (probably),
> 
> I'm not sure generation is appropriate for WTF.  WTF doesn't know anything about document.write.

Done

>> JavaScriptCore/wtf/TextPosition.h:86
>> +    int m_generation;
> 
> This doesn't seem right here.

Done

>> WebCore/bindings/js/ScriptSourceCode.h:45
>> +    ScriptSourceCode(const String& source, const KURL& url = KURL(), const TextPosition1& startPosition = TextPosition1::minimumPosition())
> 
> TextPosition1 => Does this build?

Initializing const reference with a function call? Apparently it does.

>> WebCore/bindings/v8/ScriptEventListener.cpp:54
>> +    TextPosition0 position(WTF::ZeroBasedNumber::fromZeroBasedInt(1), WTF::ZeroBasedNumber::s_base, 0);
> 
> Yes, I think this code is wrong.  :)
> 
> Let's fix it in another patch.

Done

>> WebCore/dom/XMLDocumentParserLibxml2.cpp:1373
>> +    //   New "- 1" fixes this. Is it correct?
> 
> I'm confused.  Are there tests that break with or without the -1?  We need tests to cover this.  If there are tests, we can skip the comment.  If there are no tests, I'd leave the code as-is, can keep the first line of the comment (with a capital "T").  We can then write the test and fix the bug in a followup patch.

Done #2

>> WebCore/dom/XMLDocumentParserLibxml2.cpp:1381
>> +    return  (context() ? context()->input->col : 1) - 1;
> 
> Same here.

Done #2
Comment 28 Peter Rybin 2010-09-24 14:21:39 PDT
Created attachment 68746 [details]
Patch
Comment 29 Early Warning System Bot 2010-09-24 14:32:28 PDT
Attachment 68746 [details] did not build on qt:
Build output: http://queues.webkit.org/results/4130001
Comment 30 Adam Barth 2010-09-24 15:24:12 PDT
Once you get qt to compile, I'll give this a final look through.
Comment 31 WebKit Review Bot 2010-09-24 19:51:56 PDT
Attachment 68746 [details] did not build on win:
Build output: http://queues.webkit.org/results/4061110
Comment 32 Peter Rybin 2010-09-26 15:36:13 PDT
(In reply to comment #30)
> Once you get qt to compile, I'll give this a final look through.

Hi Adam. I'm going to be on vacation until 9th of October. I can't upload patch right now, because there is some problem layout tests identify. Peter
Comment 33 Peter Rybin 2010-10-07 05:49:03 PDT
Created attachment 70071 [details]
Patch
Comment 34 Peter Rybin 2010-10-07 07:04:19 PDT
Comment on attachment 70071 [details]
Patch

Hi Adam

I hope I finally made it as accurate as possible. I should be 100% (bug-)consistent with the current code.

Peter
Comment 35 Adam Barth 2010-10-07 10:07:30 PDT
Comment on attachment 70071 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=70071&action=review

This looks fantastic.  Thanks for the patch.  A couple very minor nits below.

> JavaScriptCore/wtf/text/TextPosition.h:32
> +

Extra line.

> JavaScriptCore/wtf/text/TextPosition.h:41
> + *

Extra line.

> JavaScriptCore/wtf/text/TextPosition.h:80
> +

extra line

> JavaScriptCore/wtf/text/TextPosition.h:141
> +

extra line

> WebCore/dom/XMLDocumentParserLibxml2.cpp:1392
> +    if (context) {

prefer early return.

> WebCore/dom/XMLDocumentParserLibxml2.cpp:1407
> +    if (context) {

prefer early return
Comment 36 Peter Rybin 2010-10-08 07:31:12 PDT
Created attachment 70250 [details]
Patch
Comment 37 Adam Barth 2010-10-08 09:41:58 PDT
Comment on attachment 70250 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=70250&action=review

> JavaScriptCore/wtf/text/TextPosition.cpp:36
> +

extra blank line
Comment 38 WebKit Commit Bot 2010-10-08 09:56:42 PDT
Comment on attachment 70250 [details]
Patch

Rejecting patch 70250 from commit-queue.

Failed to run "['./WebKitTools/Scripts/webkit-patch', '--status-host=queues.webkit.org', 'build', '--no-clean', '--no-update', '--build', '--build-style=both', '--quiet']" exit_code: 2
Building WebKit
Failed to run "['WebKitTools/Scripts/build-webkit', '--debug']" exit_code: 1

Full output: http://queues.webkit.org/results/4227165
Comment 39 Yury Semikhatsky 2010-10-14 02:49:10 PDT
(In reply to comment #36)
> Created an attachment (id=70250) [details]
> Patch

The change doesn't compile on Mac. One of the reasons is that TextPosition.* are not in the XCode project but even after I added them to the project it still wouldn't compile.
Comment 40 Peter Rybin 2010-10-19 15:02:38 PDT
Created attachment 71204 [details]
Patch
Comment 41 Adam Barth 2010-10-19 17:56:52 PDT
Comment on attachment 71204 [details]
Patch

ok, let's give this a try.
Comment 42 WebKit Commit Bot 2010-10-19 19:36:18 PDT
Comment on attachment 71204 [details]
Patch

Rejecting patch 71204 from commit-queue.

Failed to run "['./WebKitTools/Scripts/webkit-patch', '--status-host=queues.webkit.org', '--bot-id=abarth-cq-sl', 'land-attachment', '--force-clean', '--ignore-builders', '--quiet', '--non-interactive', '--parent-command=commit-queue', 71204]" exit_code: 1
Logging in as commit-queue@webkit.org...
Fetching: https://bugs.webkit.org/attachment.cgi?id=71204&action=edit
Fetching: https://bugs.webkit.org/show_bug.cgi?id=45271&ctype=xml
Processing 1 patch from 1 bug.
Cleaning working directory
Updating working directory
Processing patch 71204 from bug 45271.
ERROR: /Users/abarth/git/webkit-queue/WebCore/ChangeLog neither lists a valid reviewer nor contains the string "Unreviewed" or "Rubber stamp" (case insensitive).

Full output: http://queues.webkit.org/results/4593003
Comment 43 Peter Rybin 2010-10-20 07:47:59 PDT
Created attachment 71294 [details]
Patch
Comment 44 WebKit Commit Bot 2010-10-20 13:07:08 PDT
Comment on attachment 71294 [details]
Patch

Clearing flags on attachment: 71294

Committed r70165: <http://trac.webkit.org/changeset/70165>
Comment 45 WebKit Commit Bot 2010-10-20 13:07:20 PDT
All reviewed patches have been landed.  Closing bug.
Comment 46 James Robinson 2010-10-20 13:24:45 PDT
It appears this patch broke the apple windows bots.  Are you working on fixing them?  If not this patch will probably be rolled out.
Comment 47 WebKit Review Bot 2010-10-20 13:27:18 PDT
http://trac.webkit.org/changeset/70165 might have broken Qt Linux Release
Comment 48 Csaba Osztrogonác 2010-10-20 15:11:45 PDT
(In reply to comment #47)
> http://trac.webkit.org/changeset/70165 might have broken Qt Linux Release

Reopen, because it was rolled out by http://trac.webkit.org/changeset/70181
Comment 49 Csaba Osztrogonác 2010-10-20 15:12:55 PDT
(In reply to comment #48)
> (In reply to comment #47)
> > http://trac.webkit.org/changeset/70165 might have broken Qt Linux Release
> 
> Reopen, because it was rolled out by http://trac.webkit.org/changeset/70181

http://build.webkit.org/results/Qt%20Linux%20Release/r70166%20(22299)
Comment 50 Peter Rybin 2010-10-23 15:02:38 PDT
Created attachment 71655 [details]
Patch
Comment 51 Peter Rybin 2010-10-23 15:09:18 PDT
(In reply to comment #50)
> Created an attachment (id=71655) [details]
> Patch

This patch updates XML parser for QT. It fixes following tests:
LayoutTests/fast/xpath/nsresolver-exception.xhtml
LayoutTests/fast/xsl/transform-xhr-doc.xhtml
LayoutTests/svg/custom/use-nested-missing-target-removed.svg

(I guess that's all tests that previous patch broke).
Comment 52 Adam Barth 2010-10-23 17:21:18 PDT
Comment on attachment 71655 [details]
Patch

Thanks for taking another run at this patch.
Comment 53 WebKit Review Bot 2010-10-25 08:06:34 PDT
http://trac.webkit.org/changeset/70451 might have broken Qt Linux Release
The following tests are not passing:
fast/frames/set-unloaded-frame-location.html
Comment 54 Csaba Osztrogonác 2010-10-25 09:27:27 PDT
(In reply to comment #53)
> http://trac.webkit.org/changeset/70451 might have broken Qt Linux Release
> The following tests are not passing:
> fast/frames/set-unloaded-frame-location.html

Any progression with fix?
Comment 55 Peter Rybin 2010-10-25 09:37:20 PDT
(In reply to comment #54)
> (In reply to comment #53)
> > http://trac.webkit.org/changeset/70451 might have broken Qt Linux Release
> > The following tests are not passing:
> > fast/frames/set-unloaded-frame-location.html
> 
> Any progression with fix?

Unfortunately I'm still working on reproducing it.
We agreed with caseq@ about rolling it back soon.
Comment 56 Peter Rybin 2010-11-10 05:12:45 PST
Created attachment 73489 [details]
Patch to land
Comment 57 Andrey Kosyakov 2010-11-10 06:48:51 PST
(In reply to comment #56)
> Created an attachment (id=73489) [details]

Manually committed r71735: http://trac.webkit.org/changeset/71735
Comment 58 Adam Barth 2010-11-10 10:31:07 PST
Yay!  Glad this finally landed.  Thanks for sticking with it.
Comment 59 Peter Rybin 2010-11-10 12:23:36 PST
(In reply to comment #58)
> Yay!  Glad this finally landed.  Thanks for sticking with it.

Thank you for support. I hope next steps I will do faster.