Bug 31538 - Web Inspector: Wrong console output for Regexp escape sequence
Summary: Web Inspector: Wrong console output for Regexp escape sequence
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Web Inspector (Deprecated) (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Joseph Pecoraro
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-11-16 02:16 PST by Keishi Hattori
Modified: 2009-11-30 13:46 PST (History)
7 users (show)

See Also:


Attachments
example of wrong output (7.07 KB, image/png)
2009-11-16 02:16 PST, Keishi Hattori
no flags Details
[PATCH] Half Way There (8.04 KB, patch)
2009-11-21 12:05 PST, Joseph Pecoraro
no flags Details | Formatted Diff | Diff
[PATCH] Desired Behavior (16.24 KB, patch)
2009-11-24 21:32 PST, Joseph Pecoraro
pfeldman: review-
Details | Formatted Diff | Diff
[PATCH] Desired Behavior + Some Refactoring (24.80 KB, patch)
2009-11-25 22:29 PST, Joseph Pecoraro
no flags Details | Formatted Diff | Diff
[PATCH] Desired Behavior + Some Refactoring (24.89 KB, patch)
2009-11-25 22:53 PST, Joseph Pecoraro
pfeldman: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Keishi Hattori 2009-11-16 02:16:24 PST
Created attachment 43282 [details]
example of wrong output

console output for Regexp escapes the backslash for escape sequence
Comment 1 Joseph Pecoraro 2009-11-21 12:05:17 PST
Created attachment 43654 [details]
[PATCH] Half Way There

This isn't a land-able patch, it is only for discussion. Its likely we should refactor some areas of format printing.

For this bug, regexes where using _formatvalue and not _formatregex because it was treating Proxied regexs as undecorated.  Also, _formatregex was actually adding backslashes, not removing any!  The behavior after this patch is:
http://grab.by/Iwn

Note that there is a difference in string formatting. Is this the desired behavior (different in multiple cases):

  > "test" // Case 1
  test

  > console.log("%s", "test")
  test

  > ["test"] // Case 2
  ["test"]

  > ({a:"test"})
  a : "hi"
Comment 2 Joseph Pecoraro 2009-11-21 12:06:30 PST
>   > ({a:"test"})
>   a : "hi"

Hehe, sorry about that typo:

  > ({a:"test"})
  a: "test"
Comment 3 Timothy Hatcher 2009-11-21 17:08:03 PST
(In reply to comment #1)
> Note that there is a difference in string formatting. Is this the desired
> behavior (different in multiple cases):
> 
>   > "test" // Case 1
>   test
> 
>   > console.log("%s", "test")
>   test

We should quote the string in the first case, but not for console.log.

>   > ["test"] // Case 2
>   ["test"]

Looks fine.

>   > ({a:"test"})
>   a : "test"

This should have braces.
Comment 4 Joseph Pecoraro 2009-11-24 21:32:44 PST
Created attachment 43826 [details]
[PATCH] Desired Behavior

I spent some time trying to diagnose the issues with formatting. And write them down so I remember

Output of Single Values:

  - strings  => "quoted"
  - number   => number
  - regex    => /form/
  - date     => date string
  - function => string
  - node     => html tree
  - error    => red error message, no quotes
  - array + object = special behavior


Limitations Depending on how they are Output:

  - typing a literal/variable in the console
    - functions should output fully (multiline)
    - string should output fully

  - a value containing in an array
    - functions should output abbreviated (one line)
    - strings should output abbreviated (100 chars)

  - a value of an object property
    - functions should output abbreviated (one line)
    - strings should output abbreviated (100 chars)

  - output from console.log() and family (except dir)
    - strings should NOT be quoted

  - output from the background should NOT be quoted
    - XHR Responses
    - Profile Finished


What should Probably be Linkified:

  - strings, function strings, node contents
  - regexs?  


All in all, I think my patch brings us to this behavior, however testing it turned out to be harder then I thought. At some point the test results had quotes which my manual tests did not show. My tests try three things:

  1. straight console.log
  2. console.log in an array
  3. variable execution in the console


If someone can find the time to refactor, areas to consider:

  - InjectedScript side
    - Object.describe takes an "abbreviate" flag
    - Object.type determines the type used later in _format(s)

  - Inspector side
    - entry point for console.log messages?
    - entry point for profile finished messages?
    - other generic messages like tips etc?
    - WebInspector.ConsoleView.prototype._format can be simplified
    - WebInspector.ConsoleMessage._format is complex, calls the above
  
  - Other
    - (new RegExp(...)).toString() doesn't handle "/" well.
Comment 5 Pavel Feldman 2009-11-24 23:57:09 PST
Comment on attachment 43826 [details]
[PATCH] Desired Behavior

> +CONSOLE MESSAGE: line 49: Sat Nov 21 2009 14:25:45 GMT-0500 (EST)
> +CONSOLE MESSAGE: line 50: Sat Nov 21 2009 14:25:45 GMT-0500 (EST)

Not everybody is on EST, bots will probably fail on this one. (r- due to this).

> +console-format.html:49Sat Nov 21 2009 14:25:45 GMT-0500 (EST)
> +console-format.html:50[Sat Nov 21 2009 14:25:45 GMT-0500 (EST)]
> +"Sat Nov 21 2009 14:25:45 GMT-0500 (EST)"

Ditto.

> +var node = null; // NOTE: This is filled in after the page has loaded in doit();

Why not to push everything in doit, leaving only var globals = [] here?

> +function loopOverGlobals(current)

Ok, so you want to wait for both logs and eval to be formatted before you dump the next value.

> +function frontend_evaluateStringInConsole()
> +{
> +    var expression = "\"this is a test of a raw string in the console\"";

Not sure why you need to special case this - we already check it for "test". Should you change "test" to "raw string should be quoted on eval and array, not in log." or something?

> +function frontend_evalExpression(expression)

Nit: We could refactor ConsoleView.prototype._enterKeyPressed for better testability (extract what you need into public "evaluateExpressionInConsole" or something. We already have a bunch of evalIn and doEvalIn though.

> +    this.knownFormatters = {

This does not need to be public.

> +    this.undecoratedTypes = {

Ditto.

> -        if (isProxy && type !== "object" && type !== "function" && type !== "array" && type !== "node") {
> +        if (isProxy && !(type in this.knownFormatters)) {

So the idea here was that everything comes as a proxy. I think !isProxy means null only (worth checking).
Now for primitive values, we expected description to be good, so we were unwrapping them and dumping descriptions to console (via marking them as undecorated).
It is just that we did not delete primitive formatters when migrated to proxies for primitive values (i.e. formatdate, formatstring are unused on ToT).
I think what you should do is rename your knownFormatters to customFormatters and leave only object, function, array, node, string and regex there.
Everything else should fall under undecorated types - not even need to check and create enum for them - just output descriptions there.

>      _formatdate: function(date, elem)

Should not exist.

>      },

>      _formatstring: function(str, elem)
>      {
> -        elem.appendChild(document.createTextNode("\"" + str + "\""));

>      _formatregexp: function(re, elem)
>      {
> -        var formatted = String(re.description).replace(/([\\\/])/g, "\\$1").replace(/\\(\/[gim]*)$/, "$1").substring(1);
> +        var formatted = String(re.description).replace(/\\(.)/g, "$1");

It is not clear to me why this is needed. Is the one in Object.describe bad? Should it be fixed instead? That way regex won't need custom formatter.

>          for (var i = 0; i < parameters.length; ++i) {
> -            if (Object.proxyType(parameters[i]) === "string")
> -                formattedResult.appendChild(WebInspector.linkifyStringAsFragment(parameters[i].description));
> -            else
> -                formattedResult.appendChild(formatForConsole(parameters[i]));
> +            var param = parameters[i];
> +            if (Object.proxyType(param) === "string") {
> +                if (this.originatingCommand && this.level === WebInspector.ConsoleMessage.MessageLevel.Log) {
> +                    var quotedString = "\"" + param.description.replace(/"/g, "\\\"") + "\"";
> +                    formattedResult.appendChild(WebInspector.linkifyStringAsFragment(quotedString));

I guess you expect this to be evaluation result only. My question is how can something from console raw eval have parameters?
In other words, should we special case this earlier on entering _format? Does it make sense?
Comment 6 Joseph Pecoraro 2009-11-25 22:05:41 PST
Thanks for the great review! I've worked on a patch that I want to clean up a bit and post which takes many of these comments into consideration.

(In reply to comment #5)
> (From update of attachment 43826 [details])
> > +CONSOLE MESSAGE: line 49: Sat Nov 21 2009 14:25:45 GMT-0500 (EST)
> > +CONSOLE MESSAGE: line 50: Sat Nov 21 2009 14:25:45 GMT-0500 (EST)
> 
> Not everybody is on EST, bots will probably fail on this one. (r- due to this).

I don't know a solution to this =/. Anyone know how I can control how this behaves?

  (new Date).toString()

Maybe I should leave Date out.


> > +var node = null; // NOTE: This is filled in after the page has loaded in doit();
> 
> Why not to push everything in doit, leaving only var globals = [] here?

Sounds good.


> > +function loopOverGlobals(current)
> 
> Ok, so you want to wait for both logs and eval to be formatted before you dump
> the next value.

Exactly. I thought it would be best for someone using the test to see the 3 different cases right next to each other, rather then making them scroll up and down to compare.


> > +function frontend_evaluateStringInConsole()
> > +{
> > +    var expression = "\"this is a test of a raw string in the console\"";
> 
> Not sure why you need to special case this - we already check it for "test".
> Should you change "test" to "raw string should be quoted on eval and array, not
> in log." or something?

Removed. That was my original use case, you're right that it is not needed.


> > +function frontend_evalExpression(expression)
> 
> Nit: We could refactor ConsoleView.prototype._enterKeyPressed for better
> testability (extract what you need into public "evaluateExpressionInConsole" or
> something. We already have a bunch of evalIn and doEvalIn though.

This area really bothered me in debugging. There is a chain of two or three functions that do basically one thing each, and I think (I haven't looked deeper yet) that they chain always happens from the top. I might simplify this.


> > +    this.knownFormatters = {
> 
> This does not need to be public.

Changed and renamed to this._customFormatters.


> > +    this.undecoratedTypes = {
> 
> Ditto.

Removed in recent refactoring.


> 
> >      _formatdate: function(date, elem)
> 
> Should not exist.

[See Next Comment]

> 
> >      },
> 
> >      _formatstring: function(str, elem)
> >      {
> > -        elem.appendChild(document.createTextNode("\"" + str + "\""));
> 
> >      _formatregexp: function(re, elem)
> >      {
> > -        var formatted = String(re.description).replace(/([\\\/])/g, "\\$1").replace(/\\(\/[gim]*)$/, "$1").substring(1);
> > +        var formatted = String(re.description).replace(/\\(.)/g, "$1");
> 
> It is not clear to me why this is needed. Is the one in Object.describe bad?
> Should it be fixed instead? That way regex won't need custom formatter.

Yes, Object.describe for regex was using a rather complicated replace (like you see above) which didn't seem useful. I simplified it to just String(regex).

The only custom formatters that remain (after the recent refactor) are what you would expect:

  - node
  - array
  - object

Everything else just prints out the description (built from Object.describe).


> >          for (var i = 0; i < parameters.length; ++i) {
> > -            if (Object.proxyType(parameters[i]) === "string")
> > -                formattedResult.appendChild(WebInspector.linkifyStringAsFragment(parameters[i].description));
> > -            else
> > -                formattedResult.appendChild(formatForConsole(parameters[i]));
> > +            var param = parameters[i];
> > +            if (Object.proxyType(param) === "string") {
> > +                if (this.originatingCommand && this.level === WebInspector.ConsoleMessage.MessageLevel.Log) {
> > +                    var quotedString = "\"" + param.description.replace(/"/g, "\\\"") + "\"";
> > +                    formattedResult.appendChild(WebInspector.linkifyStringAsFragment(quotedString));
> 
> I guess you expect this to be evaluation result only. My question is how can
> something from console raw eval have parameters?
> In other words, should we special case this earlier on entering _format? Does
> it make sense?

You are correct (I expected a raw eval, and there should be a special case for something with parameters). I did actually add a special case for things with parameters. However, rather then discuss this my next patch has made things cleaner and more obvious.


> > -        if (isProxy && type !== "object" && type !== "function" && type !== "array" && type !== "node") {
> > +        if (isProxy && !(type in this.knownFormatters)) {
> 
> So the idea here was that everything comes as a proxy. I think !isProxy means
> null only (worth checking).

I never ran into a case where isProxy was false. However, I left that in in case in the future someone wants to use WebInspector.console._format for something else.  I suspect the reason is because nothing calls that _format without already explicitly wrapping primitives (there is commented code in the ConsoleMessage's _format that does this).

> Now for primitive values, we expected description to be good, so we were
> unwrapping them and dumping descriptions to console (via marking them as
> undecorated).
>
> It is just that we did not delete primitive formatters when migrated to proxies
> for primitive values (i.e. formatdate, formatstring are unused on ToT).
> I think what you should do is rename your knownFormatters to customFormatters
> and leave only object, function, array, node, string and regex there.
> Everything else should fall under undecorated types - not even need to check
> and create enum for them - just output descriptions there.

Correct the description is good. Thanks for that insight, this drastically simplified ConsoleView's _format code.
Comment 7 Joseph Pecoraro 2009-11-25 22:29:27 PST
Created attachment 43895 [details]
[PATCH] Desired Behavior + Some Refactoring

This is another step in the right direction.  This patch includes:

  - a bunch of style fixes / improvements to ConsoleView
  - simplification and clarification of the current _format flow
  - the same test as before but without the Date test

Because the patch is difficult to read. Here is a better view of the reformulated _format functions:

  - ConsoleView's _format (dispatch to formatter):
    http://pastie.textmate.org/private/celurrgbupm5jvxu2hg

  - ConsoleMessage's _format (workflow) => 
    http://pastie.textmate.org/private/i0o8w6uuj2khupo4bwdrkg

Issues I have:

  - test output of linkify doesn't show that the output is a link? I may have to use something other then dumpMessages
  - test output contains quotes around text that doesn't appear in manual console tests.
  - again, I cannot find a case for _formaterror and it was removed. Could it happen?

Next Steps:

  - remove the hack I made checking for this.originatingCommand. Meaning we should separate handling of user eval'd commands in a better way or provide a more concerete way to check for this. (Maybe yet another parameter in the constructor?)
Comment 8 Joseph Pecoraro 2009-11-25 22:53:41 PST
Created attachment 43896 [details]
[PATCH] Desired Behavior + Some Refactoring

Fixed some style issues I had.
Comment 9 Timothy Hatcher 2009-11-25 23:00:56 PST
_formaterror could be triggered by:

try {
    window.foo.x = 2;
} catch (e) {
    console.log(e);
}
Comment 10 Timothy Hatcher 2009-11-25 23:02:00 PST
But mybe you should just use console.error in that case to get red?
Comment 11 Pavel Feldman 2009-11-26 05:59:34 PST
Comment on attachment 43896 [details]
[PATCH] Desired Behavior + Some Refactoring

This looks right, although lots of changes. But I think you covered pretty much everything with the tests, so we are safe!

> +    globals = [regex1, regex2, str, error, node, func, multilinefunc, num, linkify, undefined];

Could you please add null to the list?

> +        for (var i = 0; i < parameters.length; ++i)
>              if (typeof parameters[i] !== "object" && typeof parameters[i] !== "function")
>                  parameters[i] = WebInspector.ObjectProxy.wrapPrimitiveValue(parameters[i]);
> -        }

I think we use {} for more-than-one-line blocks even if that is single statement. No?
Comment 12 Joseph Pecoraro 2009-11-26 21:29:27 PST
> > +    globals = [regex1, regex2, str, error, node, func, multilinefunc, num, linkify, undefined];
> 
> Could you please add null to the list?

Sure.


> > +        for (var i = 0; i < parameters.length; ++i)
> >              if (typeof parameters[i] !== "object" && typeof parameters[i] !== "function")
> >                  parameters[i] = WebInspector.ObjectProxy.wrapPrimitiveValue(parameters[i]);
> > -        }
> 
> I think we use {} for more-than-one-line blocks even if that is single
> statement. No?

I've seen cases where there aren't braces, including Node.prototype.enclosingNodeOrSelfWithNodeNameInArray which is four layers deep. Timothy, this would be a good point for the style guide.
Comment 13 Joseph Pecoraro 2009-11-30 13:46:16 PST
Landed in http://trac.webkit.org/changeset/51506
r51506 = 723f16eb9a6dcad881b81de748e9a8c1b68c3062

Closing this bug because it solved the original bug (and more). Due to the comments keep this bug in mind for future discussion on console logging behavior.