Bug 195642 - Web Inspector: Network - HAR Import
Summary: Web Inspector: Network - HAR Import
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Web Inspector (show other bugs)
Version: WebKit Nightly Build
Hardware: All All
: P2 Normal
Assignee: Joseph Pecoraro
URL:
Keywords: InRadar
Depends on:
Blocks: 195709
  Show dependency treegraph
 
Reported: 2019-03-12 14:47 PDT by Joseph Pecoraro
Modified: 2019-03-15 13:27 PDT (History)
5 users (show)

See Also:


Attachments
[PATCH] Proposed Fix (248.12 KB, patch)
2019-03-12 14:55 PDT, Joseph Pecoraro
no flags Details | Formatted Diff | Diff
[IMAGE] Real Data (1.34 MB, image/png)
2019-03-12 15:01 PDT, Joseph Pecoraro
no flags Details
[IMAGE] Imported Data (1.24 MB, image/png)
2019-03-12 15:01 PDT, Joseph Pecoraro
no flags Details
[HAR] Test Archive (1.39 MB, text/plain)
2019-03-12 15:02 PDT, Joseph Pecoraro
no flags Details
[PATCH] Proposed Fix (249.39 KB, patch)
2019-03-13 14:03 PDT, Joseph Pecoraro
hi: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Joseph Pecoraro 2019-03-12 14:47:18 PDT
Network - HAR Import

Import a HAR.
Comment 1 Joseph Pecoraro 2019-03-12 14:53:49 PDT
<rdar://problem/34820974>
Comment 2 Joseph Pecoraro 2019-03-12 14:55:19 PDT
Created attachment 364450 [details]
[PATCH] Proposed Fix
Comment 3 EWS Watchlist 2019-03-12 14:58:34 PDT
Attachment 364450 [details] did not pass style-queue:


ERROR: Source/WebInspectorUI/ChangeLog:9:  Need whitespace between colon and description  [changelog/filechangedescriptionwhitespace] [5]
Total errors found: 1 in 15 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 4 Joseph Pecoraro 2019-03-12 15:01:33 PDT
Comment on attachment 364450 [details]
[PATCH] Proposed Fix

View in context: https://bugs.webkit.org/attachment.cgi?id=364450&action=review

> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:121
> +        if (!isNaN(requestSentWalltime) && !isNaN(archiveStartWalltime)) {

Something in this may be slightly off given the attached screenshots which show the imported showing wrong times. Though it is is also possible that maybe our export is slightly off. Needs investigation.
Comment 5 Joseph Pecoraro 2019-03-12 15:01:49 PDT
Created attachment 364453 [details]
[IMAGE] Real Data
Comment 6 Joseph Pecoraro 2019-03-12 15:01:59 PDT
Created attachment 364454 [details]
[IMAGE] Imported Data
Comment 7 Joseph Pecoraro 2019-03-12 15:02:11 PDT
Created attachment 364455 [details]
[HAR] Test Archive
Comment 8 Joseph Pecoraro 2019-03-12 17:35:42 PDT
Comment on attachment 364450 [details]
[PATCH] Proposed Fix

View in context: https://bugs.webkit.org/attachment.cgi?id=364450&action=review

>> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:121
>> +        if (!isNaN(requestSentWalltime) && !isNaN(archiveStartWalltime)) {
> 
> Something in this may be slightly off given the attached screenshots which show the imported showing wrong times. Though it is is also possible that maybe our export is slightly off. Needs investigation.

Turns out this was an export bug fixed in:
<rdar://problem/48831152> Web Inspector: Network - HAR Export duplicates blocked/send time if there was no dns/connect block (195655)

I believe the import code is correct.
Comment 9 Devin Rousso 2019-03-13 10:18:04 PDT
Comment on attachment 364450 [details]
[PATCH] Proposed Fix

View in context: https://bugs.webkit.org/attachment.cgi?id=364450&action=review

> Source/WebInspectorUI/ChangeLog:43
> +        Add an import button. When an import succeeds reset the
> +        table and only show imported resources (ignoring page
> +        loaded resources).

I don't think this is the right approach for this.  Hiding the current page's network data is  bound to cause confusion.   Especially since it's not indicated anywhere.  Perhaps we can add a <select> next to the "[ ] Preserve Log" that would appear when there are imported recordings?  That way, it's possible to toggle back and forth.

> Source/WebInspectorUI/UserInterface/Controllers/HARBuilder.js:335
> +            return null;

Should this be an "error" case?

> Source/WebInspectorUI/UserInterface/Controllers/HARBuilder.js:350
> +        return WI.Resource.ResponseSource.Other;

I never understood the "distinction" in our style between using a `default` vs returning outside the `switch`.   Do you have any "criteria" for using one over the other?

> Source/WebInspectorUI/UserInterface/Controllers/NetworkManager.js:46
> +        this._localResourcesMap = new Map;

This assumes that there is only ever one `WI.LocalResource` per url.  Does HAR explicitly only allow one resource per URL?  This would be a good opportunity to use a `Multimap` :)

> Source/WebInspectorUI/UserInterface/Controllers/NetworkManager.js:610
> +    localResourceForURL(url)

Will this be used by a later patch?

> Source/WebInspectorUI/UserInterface/Controllers/NetworkManager.js:663
> +        let mainResourceSentWalltime = WI.HARBuilder.dateFromHARDate(json.log.pages[0].startedDateTime) / 1000;

We should probably have stricter checks since this is imported data (which could be potentially anything).  Nothing will get executed, but I'd like to avoid situations where someone can load something into Web Inspector and cause it to error :(
```
    if (!Array.isArray(json.log.entries) || !Array.isArray(json.log.pages) || !json.log.pages[0] || !json.log.pages[0].startedDateTime) {
```
If you're feeling _really_ careful, we should also check that `mainResourceSentWalltime` doesn't return `NaN` :)

> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:50
> +        // NOTE: This directly overwrites WI.Resource properties.

Rather than do this, can't we pass some of these values into the constructor as an object?  The constructor of `WI.Resource` (and `WI.SourceCode`) don't do any other logic that isn't repeated here.  Is this a "future-proofing" of sorts, to make sure that if the constructor of `WI.Resource` changes this class won't also be affected?
```
    super(request.url, {
        mimeType: response.mimeType || this._responseHeaders.valueForCaseInsensitiveKey("Content-Type") || null,
        requestMethod: request.method,
        requestHeaders:  request.headers,
        requestData: request.data,
        requestSentTimestamp, request.timestamp,
        requestSentWalltime: request.walltime,
    });
```

> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:84
> +        this._failed = false; // FIXME: How should we denote a failure? Assume from status code / failure reason?
> +        this._cached = false; // FIXME: How should we denote cached? Assume from response source?

Is this not stored in the HAR?  We already add "extra" keys/values (e.g. `_fetchType`), so  maybe we could do the same for `_failed` and  `_cached`?

> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:94
> +            for (let {name, value} of headers)
> +                result[name] = value;

This limits us to having only one header per "type", although it seems like this is expected by `WI.Resource`.

> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:102
> +        // FIXME: HAR serverIPAddress lacks the port.
> +        // FIXME: HAR does not include resource priority.

Ditto (>83).

> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:126
> +            let earlierToRequestStart = timings.send / 1000;

NIT: how about `beforeRequestStart`, or even `priorToRequestStart`?

> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:130
> +            // We don't allow zero for the start time, so bump it to non-zero if it was.

Is it possible for an inspector generated HAR to have a 0 `requestSentTimestamp`?  If not, I'd consider this to be an early-return worthy error (see >NetworkManager.js:663).

> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:157
> +            if (connectDuration) {

Why is it that `connectStart` is only being set if `blockedDuration` and/or `dnsDuration` is set, but not if just `connectDuration` is set?  If  `blockedDuration` and `dnsDuration` are both 0, but `connectDuration` is non-0, shouldn't we still be setting `connectStart = accumulation` and `connectEnd = accumulation + connectDuration`?

> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:232
> +        // If that is the case undecode them here and treat as text.

"undecode"?

> Source/WebInspectorUI/UserInterface/Views/NetworkTableContentView.js:115
> +        this._harImportNavigationItem.addEventListener(WI.ButtonNavigationItem.Event.Clicked, () => { this._importHAR(); });

NIT: please put this on a new line, so it isn't confused with an implicit return arrow function.
Comment 10 Joseph Pecoraro 2019-03-13 13:53:32 PDT
Comment on attachment 364450 [details]
[PATCH] Proposed Fix

View in context: https://bugs.webkit.org/attachment.cgi?id=364450&action=review

>> Source/WebInspectorUI/ChangeLog:43
>> +        loaded resources).
> 
> I don't think this is the right approach for this.  Hiding the current page's network data is  bound to cause confusion.   Especially since it's not indicated anywhere.  Perhaps we can add a <select> next to the "[ ] Preserve Log" that would appear when there are imported recordings?  That way, it's possible to toggle back and forth.

The two cases are mutually exclusive:
• If you're importing a HAR you probably don't care about the active page's data.
• If you care about the active page's data you wouldn't import a HAR.

However I agree we should have some indication somewhere that we are viewing an imported archive.

>> Source/WebInspectorUI/UserInterface/Controllers/HARBuilder.js:350
>> +        return WI.Resource.ResponseSource.Other;
> 
> I never understood the "distinction" in our style between using a `default` vs returning outside the `switch`.   Do you have any "criteria" for using one over the other?

Yeahhh we don't really have one. I'll just make them consistent in this file. Ideally we wouldn't have a default and could lint that the switch handles all cases.

Also in this case I'll make then `console.warn` since it is user provided data, we might be missing an unknown valid case or  it could just be bad data.

>> Source/WebInspectorUI/UserInterface/Controllers/NetworkManager.js:46
>> +        this._localResourcesMap = new Map;
> 
> This assumes that there is only ever one `WI.LocalResource` per url.  Does HAR explicitly only allow one resource per URL?  This would be a good opportunity to use a `Multimap` :)

Our entire Network stack makes that "assumption" right now. Technically nothing uses this map right now anyways, I used it for testing.

No HAR doesn't restrict one resource per URL. Also the same HAR could be imported multiple times.

>> Source/WebInspectorUI/UserInterface/Controllers/NetworkManager.js:610
>> +    localResourceForURL(url)
> 
> Will this be used by a later patch?

No. I used this for debugging and in case we start using this in the future.

>> Source/WebInspectorUI/UserInterface/Controllers/NetworkManager.js:663
>> +        let mainResourceSentWalltime = WI.HARBuilder.dateFromHARDate(json.log.pages[0].startedDateTime) / 1000;
> 
> We should probably have stricter checks since this is imported data (which could be potentially anything).  Nothing will get executed, but I'd like to avoid situations where someone can load something into Web Inspector and cause it to error :(
> ```
>     if (!Array.isArray(json.log.entries) || !Array.isArray(json.log.pages) || !json.log.pages[0] || !json.log.pages[0].startedDateTime) {
> ```
> If you're feeling _really_ careful, we should also check that `mainResourceSentWalltime` doesn't return `NaN` :)

Done.

>> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:50
>> +        // NOTE: This directly overwrites WI.Resource properties.
> 
> Rather than do this, can't we pass some of these values into the constructor as an object?  The constructor of `WI.Resource` (and `WI.SourceCode`) don't do any other logic that isn't repeated here.  Is this a "future-proofing" of sorts, to make sure that if the constructor of `WI.Resource` changes this class won't also be affected?
> ```
>     super(request.url, {
>         mimeType: response.mimeType || this._responseHeaders.valueForCaseInsensitiveKey("Content-Type") || null,
>         requestMethod: request.method,
>         requestHeaders:  request.headers,
>         requestData: request.data,
>         requestSentTimestamp, request.timestamp,
>         requestSentWalltime: request.walltime,
>     });
> ```

Sure, I did this. It does seem a little weird, but does reduce duplicate work.

>> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:84
>> +        this._cached = false; // FIXME: How should we denote cached? Assume from response source?
> 
> Is this not stored in the HAR?  We already add "extra" keys/values (e.g. `_fetchType`), so  maybe we could do the same for `_failed` and  `_cached`?

Correct, but I want this to work with existing HARs today and then we can potentially extend our HAR to include additional properties.

There is a `cache` section of HAR that we do not implement.

>> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:94
>> +                result[name] = value;
> 
> This limits us to having only one header per "type", although it seems like this is expected by `WI.Resource`.

Yep, we've had this limitation for a very long time.

>> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:102
>> +        // FIXME: HAR does not include resource priority.
> 
> Ditto (>83).

Same response. I want this to work with HARs today, and extend with WebKit specific values later.

>> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:126
>> +            let earlierToRequestStart = timings.send / 1000;
> 
> NIT: how about `beforeRequestStart`, or even `priorToRequestStart`?

Prior sounds nice!

>> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:130
>> +            // We don't allow zero for the start time, so bump it to non-zero if it was.
> 
> Is it possible for an inspector generated HAR to have a 0 `requestSentTimestamp`?  If not, I'd consider this to be an early-return worthy error (see >NetworkManager.js:663).

Yes, it is possible, and likely always the case for the for the initial main resource, since that will be the pages[0].startedDateTime that we are using as the `archiveStartWalltime`:

    startedDateTime: HARBuilder.date(WI.networkManager.mainFrame.mainResource.requestSentDate),

It is just unfortunate that we don't support timing.startTime being the number 0, and I'll file a bug about that.

>> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:157
>> +            if (connectDuration) {
> 
> Why is it that `connectStart` is only being set if `blockedDuration` and/or `dnsDuration` is set, but not if just `connectDuration` is set?  If  `blockedDuration` and `dnsDuration` are both 0, but `connectDuration` is non-0, shouldn't we still be setting `connectStart = accumulation` and `connectEnd = accumulation + connectDuration`?

The 4 cases are (with any section prior to `send` being potentially -1 or 0).

     |-------------------------------------|-------|
       blocked                               send

     |-------------------------|-----------|-------|
       blocked                   connect     send

     |-----------------|-------|-----------|-------|
       blocked           dns     connect     send

     |-----------------|-------------------|-------|
       blocked           dns                 send

I've simplified this to instead just perform accumulation work:

    let accumulation = timing.startTime;

    if (hasBlocked)
        accumulation += (timings.blocked / 1000);

    if (hasDNS) {
        timing.domainLookupStart = accumulation;
        accumulation += (timings.dns / 1000);
        timing.domainLookupEnd = accumulation;
    }

    if (hasConnect) {
        timing.connectStart = accumulation;
        accumulation += (timings.connect / 1000);
        timing.connectEnd = accumulation;

        if (hasSecureConnect)
            timing.secureConnectionStart = timing.connectEnd - (timings.ssl / 1000);
    }

    accumulation += priorToRequestStart;
    timing.requestStart = accumulation;

    ...

This should better handle cases where a prior section was missing or empty and make it more straightforward to read as you were suggesting.

>> Source/WebInspectorUI/UserInterface/Views/NetworkTableContentView.js:115
>> +        this._harImportNavigationItem.addEventListener(WI.ButtonNavigationItem.Event.Clicked, () => { this._importHAR(); });
> 
> NIT: please put this on a new line, so it isn't confused with an implicit return arrow function.

We follow this pattern in more than 50% of these, but I'll newline it.
Comment 11 Joseph Pecoraro 2019-03-13 14:03:00 PDT
Created attachment 364570 [details]
[PATCH] Proposed Fix
Comment 12 EWS Watchlist 2019-03-13 14:04:23 PDT
Attachment 364570 [details] did not pass style-queue:


ERROR: Source/WebInspectorUI/ChangeLog:9:  Need whitespace between colon and description  [changelog/filechangedescriptionwhitespace] [5]
Total errors found: 1 in 15 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 13 Joseph Pecoraro 2019-03-13 21:11:10 PDT
Comment on attachment 364450 [details]
[PATCH] Proposed Fix

View in context: https://bugs.webkit.org/attachment.cgi?id=364450&action=review

>>> Source/WebInspectorUI/ChangeLog:43
>>> +        loaded resources).
>> 
>> I don't think this is the right approach for this.  Hiding the current page's network data is  bound to cause confusion.   Especially since it's not indicated anywhere.  Perhaps we can add a <select> next to the "[ ] Preserve Log" that would appear when there are imported recordings?  That way, it's possible to toggle back and forth.
> 
> The two cases are mutually exclusive:
> • If you're importing a HAR you probably don't care about the active page's data.
> • If you care about the active page's data you wouldn't import a HAR.
> 
> However I agree we should have some indication somewhere that we are viewing an imported archive.

The UI you were thinking of can be separate from this. I've put a patch up for that over on:
<https://webkit.org/b/195734> Web Inspector: Network - Toggle Between Live Activity and Imported HAR resource collections
Comment 14 Devin Rousso 2019-03-13 23:00:38 PDT
Comment on attachment 364570 [details]
[PATCH] Proposed Fix

View in context: https://bugs.webkit.org/attachment.cgi?id=364570&action=review

r=me

> LayoutTests/http/tests/inspector/network/har/resources/empty.har:10
> +        "startedDateTime": "2017-10-23T01:55:52.694Z",

Why was this changed from the previous patch. Does it have an effect on tests?

> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:76
> +        this._type = WI.Resource.typeFromMIMEType(this._mimeType);

This should get set by the `super` call, assuming you've passed in `mimeType`.

> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:100
> +    static fromHAREntry(entry, archiveStartWalltime = NaN)

Having this default to `NaN` means that it will force `requestSentTimestamp` to be `NaN` even if `requestSentWalltime` is valid.  I'd either expect an error (and `return null;`) in that case or for us to either require `archiveStartWalltime` to not be `NaN` (and error if it is) or for it to default to `0` when not provided.

> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:108
> +        let {request, response, startedDateTime, timings} = entry;
> +        let requestSentWalltime = WI.HARBuilder.dateFromHARDate(startedDateTime) / 1000;
> +        let requestSentTimestamp = requestSentWalltime - archiveStartWalltime;

Should we add checks/logging/returns if any of these values isn't property formatted in the imported JSON?  I think we should have a followup to examine the entire import "flow" and add checks with errors/bails where needed to ensure that we never throw an exception.

> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:131
> +            let priorToRequestStart = timings.send / 1000;
> +            let requestStartToResponseStart = timings.wait / 1000;
> +            let responseStartToResponseEnd = timings.receive / 1000;

I'd inline these since they are only used once.

> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:140
> +                accumulation += (timings.blocked / 1000);

Style: please remove the parenthesis, as they are unnecessary.

> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:144
> +                accumulation += (timings.dns / 1000);

Ditto (>140).

> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:150
> +                accumulation += (timings.connect / 1000);

Ditto (>140).
Comment 15 Joseph Pecoraro 2019-03-14 00:58:08 PDT
Comment on attachment 364570 [details]
[PATCH] Proposed Fix

View in context: https://bugs.webkit.org/attachment.cgi?id=364570&action=review

>> LayoutTests/http/tests/inspector/network/har/resources/empty.har:10
>> +        "startedDateTime": "2017-10-23T01:55:52.694Z",
> 
> Why was this changed from the previous patch. Does it have an effect on tests?

Yes, this is a required attribute in a valid HAR and one that HAR import now requires be parsed to non-NaN when considering if this is likely to be a valid HAR or not.

>> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:76
>> +        this._type = WI.Resource.typeFromMIMEType(this._mimeType);
> 
> This should get set by the `super` call, assuming you've passed in `mimeType`.

Oops yes thanks!

>> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:100
>> +    static fromHAREntry(entry, archiveStartWalltime = NaN)
> 
> Having this default to `NaN` means that it will force `requestSentTimestamp` to be `NaN` even if `requestSentWalltime` is valid.  I'd either expect an error (and `return null;`) in that case or for us to either require `archiveStartWalltime` to not be `NaN` (and error if it is) or for it to default to `0` when not provided.

Yes, and that is the intent. NaNs are good here, that means we did not know / do not know what it should be. That said, we will never have NaN on HAR import, and won't on Timeline Import, so I'll drop this default value and instead change it to be an assertion.

Some notes anyways.

  • walltimes are seconds from epoch (with decimals).
  • timestamps are inspector stopwatch times, so typically "seconds from inspector opening (with decimals)."
     - obviously this doesn't make sense now, so archiveStartWalltime essentially becomes the zero time for these

If we have a set of 5 entries we will want to have some base time (archiveStartWalltime) to compare them relatively in time. We could find the the earlier possible value and use that (which is what the HAR import should be doing be getting the pages[0].startedDateTime). In Timeline Import we will use a startTime that was serialized into the recording data at Timeline export time, which is pretty much just made up. But it makes sense for that base number to come from outside of this method. I'm not sure it makes sense to assume a zero though, as then entries would appear to overlap in "timestamp" time but be very different in "walltime" time.

>> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:108
>> +        let requestSentTimestamp = requestSentWalltime - archiveStartWalltime;
> 
> Should we add checks/logging/returns if any of these values isn't property formatted in the imported JSON?  I think we should have a followup to examine the entire import "flow" and add checks with errors/bails where needed to ensure that we never throw an exception.

Any math operations for timing should produce NaNs or Infinities and not be a problem. The intent was that this would produce NaNs and continue forward.

However there is certainly a possibility that the types down below that feed into `request` and `response` could have better type hardening.

>> Source/WebInspectorUI/UserInterface/Models/LocalResource.js:140
>> +                accumulation += (timings.blocked / 1000);
> 
> Style: please remove the parenthesis, as they are unnecessary.

I find it make these lines easier to read.
Comment 16 Joseph Pecoraro 2019-03-15 13:27:14 PDT
https://trac.webkit.org/changeset/242948/webkit