Bug 100566

Summary: JSC: a JSC printf (support for %J+s and %b)
Product: WebKit Reporter: Mark Lam <mark.lam>
Component: JavaScriptCoreAssignee: Mark Lam <mark.lam>
Status: RESOLVED FIXED    
Severity: Normal CC: barraclough, fpizlo, ggaren, mhahnenberg, msaboff, oliver
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
the patch.
msaboff: review-
Updated patch with more standard format compatibility, and tests.
buildbot: commit-queue-
Fixed the build breakage. Ready for review now ... fingers crossed. =) msaboff: review+

Description Mark Lam 2012-10-26 15:17:44 PDT
Ever want to call printf() on that WTF::String?  Well now you can with VMInspector::printf(). Here's the skinny:

Added VMInspector::printf(), fprintf(), sprintf(), and snprintf().
        - %b prints ints as boolean TRUE (non-zero) or FALSE (zero).
        - %Js prints a WTF::String* like a %s prints a char*.
          Also works for 16bit WTF::Strings (prints wchar_t* using %S).
        - %J+s prints the WTF::String in verbose mode.
          The + means verbose.

Of course, they also work on any % formats that the standard ...printf() functions work on as well.
Comment 1 Mark Lam 2012-10-26 15:29:06 PDT
Created attachment 171028 [details]
the patch.
Comment 2 Michael Saboff 2012-10-30 10:48:15 PDT
Comment on attachment 171028 [details]
the patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=171028&action=review

What about tests?

Overall it looks good, but I'd like to see these comments addressed.

> Source/JavaScriptCore/ChangeLog:13
> +        - %J+s prints the WTF::String in verbose mode.
> +          The + means verbose.

Consider switching the order of these two lines.  '+' is a modifier and %J+s is an example of it's use.

> Source/JavaScriptCore/ChangeLog:16
> +        (JSC):

The prepare-ChangeLog script put this in but messed up what it is for.  Change it to the right item or remove.

> Source/JavaScriptCore/interpreter/VMInspector.cpp:112
> +

Remove, only need one blank line between declarations.

> Source/JavaScriptCore/interpreter/VMInspector.cpp:139
> +

Remove this empty line.

> Source/JavaScriptCore/interpreter/VMInspector.cpp:142
> +

Remove.

> Source/JavaScriptCore/interpreter/VMInspector.cpp:154
> +    char buffer[128];

Seems like the buffer might be too small.

> Source/JavaScriptCore/interpreter/VMInspector.cpp:227
> +            // verbose mode prints: WTF::String "<your string>"

This comment isn't needed here since you have the same comment above printWTFString.

> Source/JavaScriptCore/interpreter/VMInspector.cpp:237
> +            p--;

This is not good enough.  Consider the handling of the unsupported %J+z.

> Source/JavaScriptCore/interpreter/VMInspector.cpp:251
> +        char* argFormat = buffer;

Do you need argFormat or could you just use buffer?  Using buffer could eliminate future confusion that buffer is available for use between here and the end of the while loop.

> Source/JavaScriptCore/interpreter/VMInspector.cpp:254
> +        if (UNLIKELY(curr >= end))
> +            goto formatTooLong;

Don't think the overflow check is needed here, since curr was just set to buffer.

Also, you could create a buffer overflow macro instead of putting the if ... goto everywhere.

> Source/JavaScriptCore/interpreter/VMInspector.cpp:263
> +        if (c == 'l' || c == 'h') {

What about hh, ll, L and the more obscure j, t and z length modifiers?

> Source/JavaScriptCore/interpreter/VMInspector.cpp:275
> +            ASSERT(curr[-1] == '\0');

Seems a little unnecessary given the assignment and if.

> Source/JavaScriptCore/interpreter/VMInspector.cpp:295
> +    p -= 5;

Why subtract 5?  Is this to move before the current format?  Why not just use format for the error message?

> Source/JavaScriptCore/interpreter/VMInspector.cpp:302
> +    // We've got an error. Can't do any more work. Print and error message if

Should be "an" instead of "and".

> Source/JavaScriptCore/interpreter/VMInspector.cpp:357
> +            const LChar* chars = str->characters8();
> +            printArg("%s", reinterpret_cast<const char*>(chars));

Don't think you need the temp.

> Source/JavaScriptCore/interpreter/VMInspector.cpp:360
> +            const UChar* chars = str->characters16();
> +            printArg("%S", reinterpret_cast<const wchar_t*>(chars));

Ditto.

> Source/JavaScriptCore/interpreter/VMInspector.cpp:445
> +        int count = ::vsnprintf(m_buffer, m_size, format, args);

I don't think it matters too much, but count doesn't include the terminating '\0'.
Comment 3 Mark Lam 2012-10-30 20:57:28 PDT
Responses below.  New patch to follow shortly.

(In reply to comment #2)
> (From update of attachment 171028 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=171028&action=review
> 
> What about tests?

Tests added to Tools/TestWebKitAPI/Tests/JavaScriptCore/VMInspector.cpp.

> Overall it looks good, but I'd like to see these comments addressed.
> 
> > Source/JavaScriptCore/ChangeLog:13
> > +        - %J+s prints the WTF::String in verbose mode.
> > +          The + means verbose.
> 
> Consider switching the order of these two lines.  '+' is a modifier and %J+s is an example of it's use.

Done.

> > Source/JavaScriptCore/ChangeLog:16
> > +        (JSC):
> 
> The prepare-ChangeLog script put this in but messed up what it is for.  Change it to the right item or remove.

Deleted.

> > Source/JavaScriptCore/interpreter/VMInspector.cpp:112
> > +
> > Source/JavaScriptCore/interpreter/VMInspector.cpp:142
> > +
> 
> Remove, only need one blank line between declarations.

Per our offline discussion, we're leaving these as is for better readability since the coding-style doesn't say anything about blank lines.

> > Source/JavaScriptCore/interpreter/VMInspector.cpp:139
> > +
> 
> Remove this empty line.

Done.

> > Source/JavaScriptCore/interpreter/VMInspector.cpp:154
> > +    char buffer[128];
> 
> Seems like the buffer might be too small.

This buffer is only for:
1. To temporarily hold a copy of the normal chars (not needing formatting)  to be passed to printArg() and printed.
 
    The incoming format string may contain a string of normal chars much longer than 128, but we handle this by breaking them out to 128 chars fragments and printing each fragment before re-using the buffer to load up the next fragment.

2. To hold a single "%..." format to be passed to printArg() to process a single va_arg.

    We also verify in this case that we never copy a format string greater than 128 chars.  I'm assuming that a 128 char format string is meaningless and if the user specifies such a string, then we'll treat it as an error.

Given this usage, I felt that 128 chars would be a good buffer size without using too much stack space.  BTW, I bumped the buffer size up to 129 chars so that we can have 128 chars not including the null terminator.   I've also added the above usage details as a comment in the code.

Are you seeing a flaw somewhere in my reasoning?

> > Source/JavaScriptCore/interpreter/VMInspector.cpp:227
> > +            // verbose mode prints: WTF::String "<your string>"
> 
> This comment isn't needed here since you have the same comment above printWTFString.

Deleted.

> > Source/JavaScriptCore/interpreter/VMInspector.cpp:237
> > +            p--;
> 
> This is not good enough.  Consider the handling of the unsupported %J+z.

Fixed.  I now cache the startOfFormatSpecifier at the beginning, and simply reset to that if we failed to parse the format.  This also takes care of future formats that can be more complex or of variable length.

> > Source/JavaScriptCore/interpreter/VMInspector.cpp:251
> > +        char* argFormat = buffer;
> 
> Do you need argFormat or could you just use buffer?  Using buffer could eliminate future confusion that buffer is available for use between here and the end of the while loop.

I wanted to use argFormat to improve readability (and conveyance of the intent of the code) and I had thought of the issues you brought up.  I've changed my mind now.  I use buffer everywhere, and added comments to indicate that buffer contains an argFormat instead.

> > Source/JavaScriptCore/interpreter/VMInspector.cpp:254
> > +        if (UNLIKELY(curr >= end))
> > +            goto formatTooLong;
> 
> Don't think the overflow check is needed here, since curr was just set to buffer.

This was needed, when I had argFormat separate from buffer, in case someone mess up argFormat in the future.  Since I got rid of argFormat, this is no longer needed, and is deleted.

> Also, you could create a buffer overflow macro instead of putting the if ... goto everywhere.

Done.

> > Source/JavaScriptCore/interpreter/VMInspector.cpp:263
> > +        if (c == 'l' || c == 'h') {
> 
> What about hh, ll, L and the more obscure j, t and z length modifiers?

I was going to say "tough luck" and not support every variation of the standard formats.  But on second thought, I decided to make at least a effort to make a good faith effort to make it more compatible (to the extent possible without rat-holing in handling exotic formats).  The code has been updated to support more formats variations.

> > Source/JavaScriptCore/interpreter/VMInspector.cpp:275
> > +            ASSERT(curr[-1] == '\0');
> 
> Seems a little unnecessary given the assignment and if.

Agreed. I was being a little too paranoid when the code was still in flux. It's unnecessary now and is deleted.

> > Source/JavaScriptCore/interpreter/VMInspector.cpp:295
> > +    p -= 5;
> 
> Why subtract 5?  Is this to move before the current format?  Why not just use format for the error message?

This was a little heuristic so that we get some context on the current format that we're attempting to parse.  Otherwise, the error message can print pass the % and be meaningless.  But now that I cache startOfFormatSpecifier, I can set p to that and be guaranteed a meaningful error message.

> > Source/JavaScriptCore/interpreter/VMInspector.cpp:302
> > +    // We've got an error. Can't do any more work. Print and error message if
> 
> Should be "an" instead of "and".

Fixed.

> > Source/JavaScriptCore/interpreter/VMInspector.cpp:357
> > +            const LChar* chars = str->characters8();
> > +            printArg("%s", reinterpret_cast<const char*>(chars));
> 
> Don't think you need the temp.

Not needed.  Just wanted it for readability and in case I want to inspect chars in gdb.  Since this doesn't bloat the code too much, I'd rather keep it.

> > Source/JavaScriptCore/interpreter/VMInspector.cpp:360
> > +            const UChar* chars = str->characters16();
> > +            printArg("%S", reinterpret_cast<const wchar_t*>(chars));
> 
> Ditto.

Ditto.

> > Source/JavaScriptCore/interpreter/VMInspector.cpp:445
> > +        int count = ::vsnprintf(m_buffer, m_size, format, args);
> 
> I don't think it matters too much, but count doesn't include the terminating '\0'.

Yes, I am aware of that and am actually counting on it here (in StringNFormatPrinter::printArg()):

        // Adjust the buffer to what's left if appropriate:
        if (success) {
            m_buffer += count;
            m_size -= count;
        }
Comment 4 Mark Lam 2012-10-30 20:59:02 PDT
Created attachment 171572 [details]
Updated patch with more standard format compatibility, and tests.
Comment 5 Build Bot 2012-10-30 21:02:12 PDT
Comment on attachment 171572 [details]
Updated patch with more standard format compatibility, and tests.

Attachment 171572 [details] did not pass mac-ews (mac):
Output: http://queues.webkit.org/results/14687020
Comment 6 Mark Lam 2012-10-30 21:13:14 PDT
Created attachment 171573 [details]
Fixed the build breakage.  Ready for review now ... fingers crossed. =)
Comment 7 Michael Saboff 2012-10-31 17:36:04 PDT
Comment on attachment 171573 [details]
Fixed the build breakage.  Ready for review now ... fingers crossed. =)

View in context: https://bugs.webkit.org/attachment.cgi?id=171573&action=review

r=me  Consider the comments I made.

> Source/JavaScriptCore/interpreter/VMInspector.cpp:194
> +            bool success = printArg("%s", buffer);

Since we know that buffer doesn't have any formatting, can't we just call printArg(buffer)? If so, it seems like it would be more efficient.

> Source/JavaScriptCore/interpreter/VMInspector.cpp:281
> +        if (c == '0' || c == '-' || c == ' ' || c == '+' || c == '\'') {

What about '#' for %o, %x and the floating point formats.

> Source/JavaScriptCore/interpreter/VMInspector.cpp:311
> +        if (c == 'l' || c == 'h' || c == 'j' || c == 't' || c == 'z') {

What about L for long double?

> Source/JavaScriptCore/interpreter/VMInspector.cpp:388
> +        if (*cp == '\n')

Does this work for windows?  I thought Windows terminated lines with CR,LF.
Comment 8 Mark Lam 2012-10-31 18:01:09 PDT
(In reply to comment #7)
> (From update of attachment 171573 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=171573&action=review
> 
> r=me  Consider the comments I made.
> 
> > Source/JavaScriptCore/interpreter/VMInspector.cpp:194
> > +            bool success = printArg("%s", buffer);
> 
> Since we know that buffer doesn't have any formatting, can't we just call printArg(buffer)? If so, it seems like it would be more efficient.

Can't do this.  The string to be printed may contain strings that look like escape characters but are not.  For example, the string may have a "\n" which should be printed as a '\' followed by a 'n', but if I pass it as a format, it will be printed as a newline.

> > Source/JavaScriptCore/interpreter/VMInspector.cpp:281
> > +        if (c == '0' || c == '-' || c == ' ' || c == '+' || c == '\'') {
> 
> What about '#' for %o, %x and the floating point formats.

Added.

> > Source/JavaScriptCore/interpreter/VMInspector.cpp:311
> > +        if (c == 'l' || c == 'h' || c == 'j' || c == 't' || c == 'z') {
> 
> What about L for long double?

Added.

> > Source/JavaScriptCore/interpreter/VMInspector.cpp:388
> > +        if (*cp == '\n')
> 
> Does this work for windows?  I thought Windows terminated lines with CR,LF.

Changed to (*cp == '\n' || *cp == '\r').

All changes landed in r133103: <http://trac.webkit.org/changeset/133103>.