Bug 36794 - Teach RedirectScheduler about URLString
Summary: Teach RedirectScheduler about URLString
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: Page Loading (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-03-29 17:13 PDT by Adam Barth
Modified: 2010-06-11 12:52 PDT (History)
8 users (show)

See Also:


Attachments
Patch (49.32 KB, patch)
2010-03-29 17:31 PDT, Adam Barth
no flags Details | Formatted Diff | Diff
Patch (51.58 KB, patch)
2010-03-29 17:32 PDT, Adam Barth
no flags Details | Formatted Diff | Diff
Patch (51.58 KB, patch)
2010-03-29 18:13 PDT, Adam Barth
abarth: review-
abarth: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Adam Barth 2010-03-29 17:13:32 PDT
Alexey told me on #webkit that we want to have URLString class instead of passing around URLs as Strings all the time.  It's not really feasible to convert the whole codebase at once, so I thought I'd start with the parts that touch RedirectScheduler.

My secret agenda here is to eventually attach some security bits to JavaScript URLs so we don't get more security vulnerabilities from forgetting to check access, but that will come later.
Comment 1 Adam Barth 2010-03-29 17:31:19 PDT
Created attachment 51985 [details]
Patch
Comment 2 Adam Barth 2010-03-29 17:32:31 PDT
Created attachment 51986 [details]
Patch
Comment 3 WebKit Review Bot 2010-03-29 17:40:16 PDT
Attachment 51986 [details] did not build on chromium:
Build output: http://webkit-commit-queue.appspot.com/results/1634026
Comment 4 Adam Barth 2010-03-29 18:13:32 PDT
Created attachment 51991 [details]
Patch
Comment 5 WebKit Review Bot 2010-03-29 18:19:57 PDT
Attachment 51991 [details] did not build on chromium:
Build output: http://webkit-commit-queue.appspot.com/results/1614078
Comment 6 Adam Barth 2010-03-29 18:50:01 PDT
Comment on attachment 51991 [details]
Patch

Anyway, I'll make it build before landing, obviously.
Comment 7 Darin Fisher (:fishd, Google) 2010-03-29 23:56:44 PDT
What is the advantage of adding another string type over simply using KURL in more places?

If we are concerned about the footprint of a KURL, then perhaps we could optimize it.  I can imagine it being nearly as cheap as a String object.  We could leverage the work that was done to allocate StringImpl with the UChar array, to allow us to store the additional indices held by a KURL object inline with the URL characters.  This way, we achieve a small footprint for a KURL object, and avoid having to re-parse and re-canonicalize URL strings.  We can have a simple rule that encourages use of KURL everywhere we reference absolute URLs.
Comment 8 Alexey Proskuryakov 2010-03-30 00:01:53 PDT
> If we are concerned about the footprint of a KURL, then perhaps we could
> optimize it.

Yes, that was the concern. It would be even better, if it can be done.
Comment 9 Darin Fisher (:fishd, Google) 2010-03-30 10:35:28 PDT
(In reply to comment #8)
> > If we are concerned about the footprint of a KURL, then perhaps we could
> > optimize it.
> 
> Yes, that was the concern. It would be even better, if it can be done.

It seems fairly trivial to store the indices and IsValid bit alongside the character array in memory such that a null KURL would only require sizeof(void*) memory.  We could just invent a variant of StringImpl::createUninitialized that provides for a little extra storage in the allocation.
Comment 10 Adam Barth 2010-04-02 15:58:50 PDT
Comment on attachment 51991 [details]
Patch

No love for this patch.  I can re-spin it incorporating the comments herein if folks think this is the right direction.  Otherwise, we can go more in the direction that fishd recommends.
Comment 11 Darin Adler 2010-04-02 17:16:49 PDT
(In reply to comment #10)
> Otherwise, we can go more in the
> direction that fishd recommends.

We should find out how practical his idea is.

If there is already an existing String or AtomicString, the converting it to a KURL would presumably require copying it even if we do squirrel the URL bits away inside the string buffer. But generally speaking if we can make a KURL almost the same cost as a string then we could make our code more readable.

And also, someone should take the damned "K" out ;-)
Comment 12 Adam Barth 2010-04-02 17:20:24 PDT
> We should find out how practical his idea is.

I can try sketching out some code for this.

> If there is already an existing String or AtomicString, the converting it to a
> KURL would presumably require copying it even if we do squirrel the URL bits
> away inside the string buffer. But generally speaking if we can make a KURL
> almost the same cost as a string then we could make our code more readable.

You can avoid the extra copy with another malloc, but that doesn't seem like a good trade off.  Is the memory concern really with empty KURLs?  I wish there was a good way to measure how much memory we're talking about here.

> And also, someone should take the damned "K" out ;-)

:)
Comment 13 Darin Adler 2010-04-02 17:45:08 PDT
(In reply to comment #12)
> You can avoid the extra copy with another malloc, but that doesn't seem like a
> good trade off.  Is the memory concern really with empty KURLs?  I wish there
> was a good way to measure how much memory we're talking about here.

At least we could start by figuring out how much memory a KURL uses and how much memory the String for that KURL uses on some representative platform.
Comment 14 Darin Fisher (:fishd, Google) 2010-04-02 21:04:21 PDT
I should add, that I also find Maciej's idea interesting.  To summarize:

1- URLString contains a canonical, absolute URL string.
2- ParsedURL is constructed when you need to access/replace components from an URLString.

Constructing an URLString from a String would require instantiating a ParsedURL so that canonicalization could be done.  Constructing a ParsedURL from an URLString would be cheap because ParsedURL could assume the input is already canonical.  In such a world, KURL disappears in favor of either URLString or ParsedURL.

This proposal might result in some hot-spots in the code that lead to us passing around ParsedURL objects, but I suspect those will be the exception to the norm.

Reducing the memory size of KURL so that it can be used in more places is probably a fairly low cost and incremental change to the code base, but it may be unnecessary work given Maciej's idea above.
Comment 15 Darin Adler 2010-04-03 23:10:10 PDT
Darin, I’m a little lost. Starting implementation on what you call “Maciej’s idea” is what this bug was all about. How will we decide which way to go?
Comment 16 Darin Fisher (:fishd, Google) 2010-04-04 11:33:32 PDT
(In reply to comment #15)
> Darin, I’m a little lost. Starting implementation on what you call “Maciej’s
> idea” is what this bug was all about. How will we decide which way to go?

Right... when I originally commented in this bug, it was not clear to me that this was a first step toward that plan.  There's no mention of it here.  After some discussion on #webkit, it became clear to me that that was the case.

Based on some of the comments in this bug (in particular comment #8), it sounds like the idea of optimizing KURL so that it could be used everywhere (instead of having two representations for URLs) had not been fully considered.

How to best proceed is a good question.  I can see pros and cons both ways.