Bug 73980 - webkit.py gdb pretty printer should consider 8-bit characters of StringImpl.
Summary: webkit.py gdb pretty printer should consider 8-bit characters of StringImpl.
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Hayato Ito
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-12-06 21:04 PST by Hayato Ito
Modified: 2011-12-08 19:24 PST (History)
3 users (show)

See Also:


Attachments
pretty printers for 8-bit StringImpl and LChar* (6.70 KB, patch)
2011-12-06 21:13 PST, Hayato Ito
no flags Details | Formatted Diff | Diff
rebased (7.26 KB, patch)
2011-12-07 18:24 PST, Hayato Ito
no flags Details | Formatted Diff | Diff
make it utf-8 string (7.07 KB, patch)
2011-12-07 19:22 PST, Hayato Ito
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Hayato Ito 2011-12-06 21:04:58 PST
A webkit.py gdb pretty printer always uses StringImpl.m_data16 as data buffer. This is wrong assumption and pretty printer for WTFString seems broken.
We should take care of 8bit data buffers, m_data8, as well as m_data16.
Comment 1 Hayato Ito 2011-12-06 21:13:11 PST
Created attachment 118166 [details]
pretty printers for 8-bit StringImpl and LChar*
Comment 2 Tony Chang 2011-12-07 09:54:33 PST
Comment on attachment 118166 [details]
pretty printers for 8-bit StringImpl and LChar*

Can you rebase your change? https://bugs.webkit.org/show_bug.cgi?id=73878 added similar support, but it could be extended further to decode the utf8 and handle LChar*.
Comment 3 Hayato Ito 2011-12-07 17:36:35 PST
Ops. I didn't notice there was a similar unlanded patch. Let me rebase/merge two patches and upload the result.

(In reply to comment #2)
> (From update of attachment 118166 [details])
> Can you rebase your change? https://bugs.webkit.org/show_bug.cgi?id=73878 added similar support, but it could be extended further to decode the utf8 and handle LChar*.
Comment 4 Hayato Ito 2011-12-07 18:24:44 PST
Created attachment 118301 [details]
rebased
Comment 5 Hayato Ito 2011-12-07 18:29:28 PST
I rebased it.

I don't think that we can decode m_data8 using 'utf-16' as in https://bugs.webkit.org/show_bug.cgi?id=73878. So in this patch, I return m_data8 without any decoding. I think this is a reasonable assumption.
Comment 6 Tony Chang 2011-12-07 18:33:21 PST
Comment on attachment 118301 [details]
rebased

View in context: https://bugs.webkit.org/attachment.cgi?id=118301&action=review

> Tools/gdb/webkit.py:60
> +def ustring_to_unicode_string(ptr, length=None):
>      """Convert a pointer to UTF-16 data into a Python Unicode string.

We should try to be consistent with our use of unicode strings or utf8 strings.  Using utf8 might be easier, but I'll defer to your judgement.

> Tools/gdb/webkit.py:71
>      string = struct.pack('H' * length, *char_vals).decode('utf-16', 'replace')
> +    return string + unicode(error_message)

You could use utf8 for both of these.

> Tools/gdb/webkit.py:133
> +            return lstring_to_string(self.val['m_data8'], self.get_length())
> +        return ustring_to_unicode_string(self.val['m_data16'], self.get_length()).encode('utf-8')

And could drop the extra encode('utf-8') here.
Comment 7 Hayato Ito 2011-12-07 19:22:06 PST
Created attachment 118310 [details]
make it utf-8 string
Comment 8 Hayato Ito 2011-12-07 19:23:07 PST
Thank you for the review.

I updated the patch so that it uses Python's (non-unicode) string everywhere.

(In reply to comment #6)
> (From update of attachment 118301 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=118301&action=review
> 
> > Tools/gdb/webkit.py:60
> > +def ustring_to_unicode_string(ptr, length=None):
> >      """Convert a pointer to UTF-16 data into a Python Unicode string.
> 
> We should try to be consistent with our use of unicode strings or utf8 strings.  Using utf8 might be easier, but I'll defer to your judgement.
> 
> > Tools/gdb/webkit.py:71
> >      string = struct.pack('H' * length, *char_vals).decode('utf-16', 'replace')
> > +    return string + unicode(error_message)
> 
> You could use utf8 for both of these.
> 
> > Tools/gdb/webkit.py:133
> > +            return lstring_to_string(self.val['m_data8'], self.get_length())
> > +        return ustring_to_unicode_string(self.val['m_data16'], self.get_length()).encode('utf-8')
> 
> And could drop the extra encode('utf-8') here.
Comment 9 Tony Chang 2011-12-08 09:34:51 PST
Comment on attachment 118310 [details]
make it utf-8 string

Thanks!
Comment 10 WebKit Review Bot 2011-12-08 19:24:46 PST
Comment on attachment 118310 [details]
make it utf-8 string

Clearing flags on attachment: 118310

Committed r102420: <http://trac.webkit.org/changeset/102420>
Comment 11 WebKit Review Bot 2011-12-08 19:24:51 PST
All reviewed patches have been landed.  Closing bug.