Bug 200747

Summary: Web Inspector: have more aggressive checks for dataURLs provided to `console.screenshot`
Product: WebKit Reporter: Devin Rousso <hi>
Component: Web InspectorAssignee: Devin Rousso <hi>
Status: RESOLVED FIXED    
Severity: Normal CC: commit-queue, ews-watchlist, hi, inspector-bugzilla-changes, joepeck, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: All   
OS: All   
Bug Depends on: 199307    
Bug Blocks:    
Attachments:
Description Flags
Patch
none
Patch
none
Patch none

Description Devin Rousso 2019-08-14 18:16:28 PDT
Attempting to load an image from a base64 dataURL is async, so we aren't able to do that in the `PageConsoleClient`, but we can at least synchronously try to decode the base64 string to see if that's even valid.
Comment 1 Devin Rousso 2019-08-14 18:18:02 PDT
Created attachment 376338 [details]
Patch
Comment 2 Joseph Pecoraro 2019-08-15 12:19:22 PDT
Comment on attachment 376338 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=376338&action=review

> LayoutTests/inspector/console/console-screenshot.html:142
> +    addTest({
> +        name: "console.screenshot.String.Base64.InvalidContent",
> +        expression: `console.screenshot("data:image/png;base64,<INVALID>")`,
> +        shouldCaptureViewport: true,
> +    });

r-, I wonder not expect if the user explicitly provides a dataURL for it to capture the viewport.
Comment 3 Joseph Pecoraro 2019-08-15 12:19:44 PDT
Comment on attachment 376338 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=376338&action=review

>> LayoutTests/inspector/console/console-screenshot.html:142
>> +    });
> 
> r-, I wonder not expect if the user explicitly provides a dataURL for it to capture the viewport.

Typo: "I would not expect"
Comment 4 Devin Rousso 2019-08-15 13:28:52 PDT
Comment on attachment 376338 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=376338&action=review

>>> LayoutTests/inspector/console/console-screenshot.html:142
>>> +    });
>> 
>> r-, I wonder not expect if the user explicitly provides a dataURL for it to capture the viewport.
> 
> Typo: "I would not expect"

In this case, the dataURL isn't valid, so I'm treating it like a regular string that's being logged (e.g. `console.screenshot("foo");`).  I suppose if the given string matches a dataURL pattern (`/^data:image\/(?:[^;]+;)+base64,`), that should be treated as an error if it's not decodable.
Comment 5 Devin Rousso 2019-08-15 14:14:12 PDT
Created attachment 376422 [details]
Patch
Comment 6 Joseph Pecoraro 2019-08-15 15:18:48 PDT
Comment on attachment 376422 [details]
Patch

Why aren't these just showing a broken image?
Comment 7 Devin Rousso 2019-08-15 16:04:56 PDT
(In reply to Joseph Pecoraro from comment #6)
> Why aren't these just showing a broken image?
I suppose we could do that, but I'd like to avoid sending dataURLs to the frontend as much as possible, as they're likely to be quite large.  For every other `possibleTarget` type (e.g. `Node`, `ImageBitmap`, etc.), we _only_ send the screenshot to the frontend (as a dataURL) once we know that the screenshot is valid.  For consistency, we probably could _always_ try to load an image with the dataURL and _always_ show broken images in the case that they're not valid, but I think a message is clearer in saying "this dataURL can't be loaded" than a broken image.

I also want to make sure that developers can still `console.screenshot("a", "b", 1, 2)` and get a screenshot of the viewport with some extra data logged.
Comment 8 Joseph Pecoraro 2019-08-16 01:14:31 PDT
(In reply to Devin Rousso from comment #7)
> (In reply to Joseph Pecoraro from comment #6)
> > Why aren't these just showing a broken image?
> I suppose we could do that, but I'd like to avoid sending dataURLs to the
> frontend as much as possible, as they're likely to be quite large.

Isn't the developer requesting this by calling `console.log(...)` or `console.screenshot(...)`? I'd rather we do what the developer requested then try to apply smarts.

> For every other `possibleTarget` type (e.g. `Node`, `ImageBitmap`, etc.), we
> _only_ send the screenshot to the frontend (as a dataURL) once we know that
> the screenshot is valid.  For consistency, we probably could _always_ try to
> load an image with the dataURL and _always_ show broken images in the case
> that they're not valid, but I think a message is clearer in saying "this
> dataURL can't be loaded" than a broken image.

Even with the patch as written we can't know if the dataURL can be loaded or not and might still get a broken image in some cases?

> I also want to make sure that developers can still `console.screenshot("a",
> "b", 1, 2)` and get a screenshot of the viewport with some extra data logged.

That sounds good. I'm not sure how that conflicts with this though.
Comment 9 Devin Rousso 2019-08-16 10:27:52 PDT
(In reply to Joseph Pecoraro from comment #8)
> (In reply to Devin Rousso from comment #7)
> > (In reply to Joseph Pecoraro from comment #6)
> > > Why aren't these just showing a broken image?
> > I suppose we could do that, but I'd like to avoid sending dataURLs to the frontend as much as possible, as they're likely to be quite large.
> 
> Isn't the developer requesting this by calling `console.log(...)` or `console.screenshot(...)`? I'd rather we do what the developer requested then try to apply smarts.
I don't see how this is "smarts".  Given our functionality for every other type of target (e.g. `Node`, `ImageBitmap`, etc.), we _never_ show a broken image, so to suddenly have a broken image for dataURL-like strings would be the thing I wouldn't expect.  All other target types will only show an image if we know the image is valid, so why shouldn't we try to do as much of that same "pre-processing/pre-testing" as possible for dataURL-like strings too?

> > For every other `possibleTarget` type (e.g. `Node`, `ImageBitmap`, etc.), we _only_ send the screenshot to the frontend (as a dataURL) once we know that the screenshot is valid.  For consistency, we probably could _always_ try to load an image with the dataURL and _always_ show broken images in the case that they're not valid, but I think a message is clearer in saying "this dataURL can't be loaded" than a broken image.
> 
> Even with the patch as written we can't know if the dataURL can be loaded or
> not and might still get a broken image in some cases?
Right, that's why I'd said "as much as possible".  Just to be clear, I'm not against the idea of always sending dataURL-like strings (see below) to the frontend to be shown as an image, I just don't want us to send _all_ strings (regardless of their dataURL-ness) to be treated this way (see below for reason why).

> > I also want to make sure that developers can still `console.screenshot("a", "b", 1, 2)` and get a screenshot of the viewport with some extra data logged.
> 
> That sounds good. I'm not sure how that conflicts with this though.
Are you suggesting that we treat _all_ strings as dataURLs, or only those that match `/^data:image\/(?:[^;]+;)+base64,`?

If the former, how would `console.screenshot("test")` work?  Right now, `console.screenshot(myObject)` would log `myObject` and then take a screenshot of the viewport.  I want developers to be able to do the same thing with non-dataURL strings (e.g. `console.screenshot("taking screenshot of viewport")`), meaning that the string target is instead used as an extra log.

If the latter, I don't have a strong argument against that, so I'm fine with us not testing the validity of dataURLs at all (other than the pattern match above) in the backend and just always sending the dataURL to the frontend.
Comment 10 Joseph Pecoraro 2019-08-16 11:57:42 PDT
> I don't see how this is "smarts".  Given our functionality for every other
> type of target (e.g. `Node`, `ImageBitmap`, etc.), we _never_ show a broken
> image, so to suddenly have a broken image for dataURL-like strings would be
> the thing I wouldn't expect.  All other target types will only show an image
> if we know the image is valid, so why shouldn't we try to do as much of that
> same "pre-processing/pre-testing" as possible for dataURL-like strings too?

That doesn't seem to be true. We show broken images for nodes that contain broken images. So this matches existing behavior.

Example:

    <img id="img" src="foo.txt">

Then:

    js> console.screenshot(document.getElementById("img"));
    [ broken image ]

Unless you've changed that?


> Are you suggesting that we treat _all_ strings as dataURLs, or only those
> that match `/^data:image\/(?:[^;]+;)+base64,`?

Not all strings. I'd say only special case strings that start with "data:", aka data URLs. Maybe "data:image/" if you want to be more specific, but I'm not sure the regex really gains us anything. You could imagine that if the user input a string that is a URL (http even) we could attempt to load it, but I think loading over a network would be ridiculous.


> If the former, how would `console.screenshot("test")` work?  Right now,
> `console.screenshot(myObject)` would log `myObject` and then take a
> screenshot of the viewport.  I want developers to be able to do the same
> thing with non-dataURL strings (e.g. `console.screenshot("taking screenshot
> of viewport")`), meaning that the string target is instead used as an extra
> log.
> 
> If the latter, I don't have a strong argument against that, so I'm fine with
> us not testing the validity of dataURLs at all (other than the pattern match
> above) in the backend and just always sending the dataURL to the frontend.

The latter is the simplest and would get developers useful functionality.
Comment 11 Joseph Pecoraro 2019-08-16 20:21:48 PDT
Comment on attachment 376422 [details]
Patch

I what we ended up agreeing on was:

  • Special case "data:" to send as an image
  • If rendering the dataURL fails (onerror) in the frontend, output an error message without a broken image icon

Frontend could avoid flashes but given this (errors) is such an unlikely case I'm not sure it matters much.
Comment 12 Devin Rousso 2019-08-17 02:04:29 PDT
Created attachment 376602 [details]
Patch
Comment 13 EWS Watchlist 2019-08-17 02:08:08 PDT Comment hidden (obsolete)
Comment 14 Joseph Pecoraro 2019-08-19 21:17:10 PDT
Comment on attachment 376602 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=376602&action=review

Awesome! r=me

> Source/WebCore/page/PageConsoleClient.cpp:389
> +            if (possibleTarget.getString(state, base64) && base64.startsWithIgnoringASCIICase("data:"_s) && base64.length() > 5) {

Is the `length() > 5` check needed?
Comment 15 WebKit Commit Bot 2019-08-19 21:22:09 PDT
Comment on attachment 376602 [details]
Patch

Clearing flags on attachment: 376602

Committed r248890: <https://trac.webkit.org/changeset/248890>
Comment 16 WebKit Commit Bot 2019-08-19 21:22:11 PDT
All reviewed patches have been landed.  Closing bug.
Comment 17 Radar WebKit Bug Importer 2019-08-19 21:23:19 PDT
<rdar://problem/54497945>
Comment 18 Devin Rousso 2019-08-19 21:35:13 PDT
Comment on attachment 376602 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=376602&action=review

>> Source/WebCore/page/PageConsoleClient.cpp:389
>> +            if (possibleTarget.getString(state, base64) && base64.startsWithIgnoringASCIICase("data:"_s) && base64.length() > 5) {
> 
> Is the `length() > 5` check needed?

This was mainly to check that the `base64` string is more than just "data:", as I could imagine that "data:" is used as a logging prefix (e.g. `console.log("data:", object1, object2);`).
Comment 19 Joseph Pecoraro 2019-08-19 21:42:09 PDT
> This was mainly to check that the `base64` string is more than just "data:",
> as I could imagine that "data:" is used as a logging prefix (e.g.
> `console.log("data:", object1, object2);`).

Good idea!