Bug 25376

Summary: Refactor localStorage code for use in multi-process browsers
Product: WebKit Reporter: Jeremy Orlow <jorlow>
Component: WebCore Misc.Assignee: Jeremy Orlow <jorlow>
Status: RESOLVED FIXED    
Severity: Normal CC: andersca, beidson, dglazkov, fishd, michaeln, playmobil, sam, yael, zwarich
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Bug Depends on: 26629, 26658, 26698, 26732    
Bug Blocks:    
Attachments:
Description Flags
The first patch.
none
Revision of the last patch
none
Patch 2 in this set
none
A _slight_ ref of patch 1
none
Patch 3 in this set
none
Patch 4 in this set
none
Patch 5 in this set
none
patch 1 - v3
none
patch 1 - v4
fishd: review-
patch 1 - v5
none
patch 1 - v6
fishd: review+
patch 2 - v2 none

Jeremy Orlow
Reported 2009-04-24 13:47:24 PDT
This is my first WebKit ever bug. I'm not sure how to assign it to myself and I'm also not exactly sure who to cc. Anyhow... This bug is to track the first step in getting localStorage working in Chromium. The high level plan was discussed offline with Brady, Maciej, Anders, and Sam. The basic idea is to have a clear split between a "front end" and a "back end" of localStorage. The two will call each other in such a way that a proxy layer can be added in between. In multi-process browsers (like Chromium) the proxy layer will be used to marshal the data across its IPC mechanisms. At first glance, there's not a huge amount of work necessary. The Storage class is more or less a front end and LocalStorageArea is more or less a back end already. The StorageEvent code isn't currently split up, but it shouldn't be hard to do. It might make sense to rename the *StorageArea classes to *StorageBackend since this is what they'll be. I'll continue to update this bug as I further scope the work. I'm hoping to start writing code for this in the next week or two. FYI: The Chromium side of things will be tracked here: http://crbug.com/4360
Attachments
The first patch. (40.38 KB, patch)
2009-06-03 17:48 PDT, Jeremy Orlow
no flags
Revision of the last patch (39.18 KB, patch)
2009-06-04 18:24 PDT, Jeremy Orlow
no flags
Patch 2 in this set (10.11 KB, patch)
2009-06-05 15:37 PDT, Jeremy Orlow
no flags
A _slight_ ref of patch 1 (28.35 KB, patch)
2009-06-05 16:04 PDT, Jeremy Orlow
no flags
Patch 3 in this set (39.97 KB, patch)
2009-06-05 19:11 PDT, Jeremy Orlow
no flags
Patch 4 in this set (34.15 KB, patch)
2009-06-09 01:12 PDT, Jeremy Orlow
no flags
Patch 5 in this set (44.51 KB, patch)
2009-06-12 00:13 PDT, Jeremy Orlow
no flags
patch 1 - v3 (39.55 KB, patch)
2009-06-16 15:56 PDT, Jeremy Orlow
no flags
patch 1 - v4 (39.46 KB, patch)
2009-06-16 17:44 PDT, Jeremy Orlow
fishd: review-
patch 1 - v5 (39.40 KB, patch)
2009-06-18 12:54 PDT, Jeremy Orlow
no flags
patch 1 - v6 (39.44 KB, patch)
2009-06-19 16:32 PDT, Jeremy Orlow
fishd: review+
patch 2 - v2 (11.66 KB, patch)
2009-06-22 15:46 PDT, Jeremy Orlow
no flags
Jeremy Orlow
Comment 1 2009-05-14 11:04:46 PDT
Created a proposal for this here: http://docs.google.com/Doc?id=dhs4g97m_8cwths74m
Jeremy Orlow
Comment 2 2009-06-03 17:48:45 PDT
Created attachment 30936 [details] The first patch. In preparation for combining SessionStorage and LocalStorage into a single StorageManager, move the syncing bits out of LocalStorage and into its own class. The next step is to combine LocalStorageArea and SessionStorageArea into one. Then SessionStorage and LocalStorage can be combined. Note that combining them will cut down on code paths that must be proxied, eliminate virtual dispatch, and allow Chromium to (eventually) write historical sessionStorage to disk when memory is getting tight. Also remove a couple bits of cruft including code for quotas which is unnecessary (since a meta-data db is unnecessary since you can just count bytes as you read the local storage databases into memory).
Jeremy Orlow
Comment 3 2009-06-04 18:24:37 PDT
Created attachment 30978 [details] Revision of the last patch Oops, a change I was making to 'contains' (which I've since undone) accidentally made it into the last patch. I think this fixes it. Would really appreciate if someone could take a look!
Michael Nordman
Comment 4 2009-06-04 19:25:39 PDT
I'm really not at all familiar with the localstore code, so I'm not in a great position to give you a real review. (Not that I have review rights anyway:) 28 count bytes as you read the local storage databases into memory This implies to me the entire db is slurped into memory. Is that right? 65 : m_path(path.copy()) nit: is the call to .copy() really necessary 68 if (!path.isEmpty()) 69 m_syncManager = StorageSyncManager::create(m_path); nit: use the same (m_)path variable in the test and create call 115 ASSERT(origin); 116 if (!origin) 117 return String(); nit: which is it, required or not? One of them can probably go. 124 return pathByAppendingComponent(m_path, origin->databaseIdentifier() + ".localstorage"); Oh, a unique database file per origin. I see this was pre-existing code. Do you know why there is one per origin instead of one for all? Just curious. WebCore/storage/StorageSyncManager.cpp line 80: extra whitespace here
Jeremy Orlow
Comment 5 2009-06-04 20:47:28 PDT
(In reply to comment #4) > I'm really not at all familiar with the localstore code, so I'm not in a great > position to give you a real review. (Not that I have review rights anyway:) > > > 28 count bytes as you read the local storage databases into memory > > This implies to me the entire db is slurped into memory. Is that right? Yes. This is how it's always been. > > 65 : m_path(path.copy()) > > nit: is the call to .copy() really necessary This is how the original code was, but I can take a look to see if it's OK to skip the .copy(). > 68 if (!path.isEmpty()) > 69 m_syncManager = StorageSyncManager::create(m_path); > > nit: use the same (m_)path variable in the test and create call Oops. Quite correct. > 115 ASSERT(origin); > 116 if (!origin) > 117 return String(); > > nit: which is it, required or not? One of them can probably go. ASSERT will tell you there's a problem in debug mode, but WebKit will crash a few lines later without the if statement. I don't know if there are actual cases where origin could be null or if this code was "just in case". Something I would like to do in the future is investigate this stuff, but I don't think it's a priority. > 124 return pathByAppendingComponent(m_path, origin->databaseIdentifier() + > ".localstorage"); > > Oh, a unique database file per origin. I see this was pre-existing code. > Do you know why there is one per origin instead of one for all? Just curious. Because there's a database per origin. > WebCore/storage/StorageSyncManager.cpp > line 80: extra whitespace here Noted. These really aren't enough changes for me to roll another patch as is, but I'll do it before this gets submitted. Thanks for reviewing, Michael.
Jeremy Orlow
Comment 6 2009-06-05 15:37:55 PDT
Created attachment 31014 [details] Patch 2 in this set I'm trying to make these patches as small and self contained as possible. I'll continue rolling them out while I wait on reviews for earlier ones. I'm now using git with a branch for each one, so I should be able to keep all the layers straight.....but prompt reviews are obviously going to make this easier on everyone. This is the second patch in the set. It takes another step towards abstracting out the sync process. In this case, it's getting rid of all the 'internalGetItem' (and associated calls) and instead replaces them with one virtual function call that'll block until the import is complete. I'm pretty sure that the behavior for setItem, removeItem, and clear are NOT safe. getItem and contains do have a small optimization that is legit, but I'm don't think it helps a whole lot in practice since, if it's not present right away, it'll block until the import is complete. I've made a FIXME comment about possible optimizations. I'd like to implement this after a couple more rounds of refactoring. Unlike most svn patches, this needs to be applied with 'patch -p1' instead of 'patch -p0'. I'll create a ChangeLog and finalize this patch once the first one is reviewed. I'm putting this (and probably more soon) up mostly as a FYI.
Jeremy Orlow
Comment 7 2009-06-05 16:04:44 PDT
Created attachment 31016 [details] A _slight_ ref of patch 1 A couple tiny style changes from the last patch. This is also now generated from 'git diff' which means that StorageSyncManager.* shows up just as an add rather than a copy from LocalStorage.*. This is significant because much of the code in this new file is actually unmodified...so the reviewers job is easier than it might otherwise appear. :-)
Jeremy Orlow
Comment 8 2009-06-05 16:32:11 PDT
I figured it might help reviewers if I explained what I'm doing in excruciating detail. High level overview of structure: *DOMWindow code* Calls LocalStorage/SessionStorage to get a storage area Creates a Storage object and returns it to the javascript implementation *LocalStorage/SessionStorage code* Keeps track of *StorageArea objects associated with each origin Handles part of the syncing issues *StorageArea/LocalStorageArea/SessionStorageArea* Uses StorageMap to keep track of the actual data Dispatches events Handles the rest of the syncing and concurrency issues *StorageMap* Keeps the string -> string map that is DOM Storage Where I'm headed: I want to reorganize everything by function. I want to have the event dispatching code in one place. I want the syncing code in one place. And I want the data mapping in one place. To do this, I intend to pull the factory part of LocalStorage/SessionStorage into a StorageManager. I want to pull the syncing part of LocalStorage into a StorageSyncManager. I want to pull LocalStorageArea syncing code either into that manager or (more likely) a StorageAreaSync class. I want to move the event dispatching into a StorageEventDispatcher class. This is important for setting up proxies across processes. For the StorageArea, StorageManager, and their helpers, the "real" implementation will live in Chromium's browser process (the main, unsandboxed one). For StorageEventDispatcher, the "real" implementation will live in the renderer process and its proxy will live in the browser process. If this doesn't make sense, let me know and I'll try to flesh it out further (or update the design doc :-) As of right now, here's what I have planned in terms of patches. This will probably change, but hopefully it can give you a high level overview of where I'm going with this: 1: Pull LocalStorage sync code into its own class. 2: Clean up the LocalStorageCode for handling the initial import. 3: Pull the rest of the sync code out of localStorageArea. 4: Pull the event dispatching code out of *StorageArea. 5: Combine LocalStorageArea and SessionStorageArea. Combine LocalStorage and SessionStorage into StorageManager. All of the heavy lifting will have already been done in other patches. 6: Made the actual proxy split. Once again, if you have questions PLEASE ASK! I really want to keep moving forward on this, so I'll take whatever time necessary to explain what I'm doing. :-)
Jeremy Orlow
Comment 9 2009-06-05 19:11:27 PDT
Created attachment 31024 [details] Patch 3 in this set Patch for item #3. (Moving the rest of the syncing code out of LocalStorageArea.) Once #1 and #2 are in, I'll make a ChangeLog for this one and add the new file to the visual studio project, make file, etc.
Jeremy Orlow
Comment 10 2009-06-09 01:12:11 PDT
Created attachment 31085 [details] Patch 4 in this set Combine LocalStorageArea and SessionStorageArea into StorageArea. Next up, combine LocalStorage and SessionStorage into StorageManager.
Jeremy Orlow
Comment 11 2009-06-12 00:13:37 PDT
Created attachment 31193 [details] Patch 5 in this set Merge SessionStorage and LocalStorage into a single StorageNamespace class (since that's essentially what they both were to begin with). The next step is to add a proxy for StorageNamespace and StorageArea.
Cameron Zwarich (cpst)
Comment 12 2009-06-15 23:38:17 PDT
Why do we need some complicated multiprocess sync code in all local storage code when there is no way to build the WebKit tree in a multiprocess form?
Jeremy Orlow
Comment 13 2009-06-16 05:16:44 PDT
Chromium is a WebKit based browser that makes use of a multi-process based architecture. You can find more details here: http://dev.chromium.org/developers/design-documents/multi-process-architecture Please take a look at the patches I've posted so far. I think the architecture is actually simpler after them. Stuff like syncing code is broken out into its own area. What remained could then be combined and some virtual dispatch and tricks like having an "internalFunctionFoo" could be eliminated. As for "some complicated multiprocess sync code," I actually haven't posted any multi-process specific code yet. It will make things more complex, but it's important so that Chromium and the rest of WebKit can continue sharing as much code as possible. A couple of us met with a bunch of Apple guys (all of them in the cc list except for Maciej) to talk through the general strategy before. One of the things we talked about was moving in the direction where it's easier for others to use WebKit in a multi-process manner. We've also discussed some of the details on the webkit-dev mailing list. (I can dig up some of the thread names if that'd be helpful.) I think the benefits of sharing code will outweigh the negatives of added complexity. And I think the 5 patches I've posted so far will help reduce this added complexity.
Jeremy Orlow
Comment 14 2009-06-16 15:56:30 PDT
Created attachment 31389 [details] patch 1 - v3 Synced the repo and generated the patch with svn-create-patch. This should help the readability of StorageSyncManager.* (since it started as a copy of LocalStorage.*).
Jeremy Orlow
Comment 15 2009-06-16 17:44:03 PDT
Created attachment 31395 [details] patch 1 - v4 Added ENABLE(DOM_STORAGE) guards around the new files.
Jeremy Orlow
Comment 16 2009-06-16 18:45:19 PDT
Note that I've generated newer versions of the other patches (after syncing with head and adding guards) but they're almost the same, so I'm not going to bother uploading them unless someone expresses any amount of interest.
Darin Fisher (:fishd, Google)
Comment 17 2009-06-17 22:09:51 PDT
Comment on attachment 31395 [details] patch 1 - v4 This refactoring seems like a good idea to me. Reading over this change, I spotted some stylistic things you'll want to fix. Marking R- on account of those. > Index: WebCore/storage/LocalStorage.cpp ... > LocalStorage::LocalStorage(const String& path) > : m_path(path.copy()) > + , m_syncManager(0) nit: no need for this line since RefPtr's default constructor is equivalent. it is commonplace in webcore to exclude lines like this. > { > - // If the path is empty, we know we're never going to be using the thread for anything, so don't start it. > - // In the future, we might also want to consider removing it from the DOM in that case - <rdar://problem/5960470> > - if (path.isEmpty()) > - return; Can you comment on why it makes sense to delete this comment? > @@ -92,54 +88,18 @@ PassRefPtr<StorageArea> LocalStorage::st ... > - storageArea = LocalStorageArea::create(origin, this); > + nit: ^^^ looks like some extraneous white spaces on that new line. > Index: WebCore/storage/LocalStorageArea.cpp ... > void LocalStorageArea::scheduleFinalSync() > { > + ASSERT(isMainThread()); > + if (!m_syncManager) > + return; > + nit: ^^^ looks like some extraneous white spaces on that new line. > Index: WebCore/storage/StorageSyncManager.cpp ... > +void StorageSyncManager::close() ... > + nit: ^^^ looks like some extraneous white spaces on that new line. > +bool StorageSyncManager::scheduleImport(PassRefPtr<LocalStorageArea> area) > { > ASSERT(isMainThread()); > - > + nit: ^^^ looks like some extraneous white spaces on that new line. > +void StorageSyncManager::scheduleSync(PassRefPtr<LocalStorageArea> area) > { > ASSERT(isMainThread()); > + nit: ^^^ looks like some extraneous white spaces on that new line. > if (m_thread) > m_thread->scheduleSync(area); > } > Index: WebCore/storage/StorageSyncManager.h ... > + class StorageSyncManager: public ThreadSafeShared<StorageSyncManager> { nit: need a space before the ":"
Jeremy Orlow
Comment 18 2009-06-18 12:54:07 PDT
Created attachment 31504 [details] patch 1 - v5 (In reply to comment #17) > > { > > - // If the path is empty, we know we're never going to be using the thread for anything, so don't start it. > > - // In the future, we might also want to consider removing it from the DOM in that case - <rdar://problem/5960470> > > - if (path.isEmpty()) > > - return; > > Can you comment on why it makes sense to delete this comment? The comment just doesn't really fit in anywhere now. The patch check is gone since the manager now handles that stuff and the comment doesn't reflect the current thinking. (https://bugs.webkit.org/show_bug.cgi?id=25894) Once the re-factoring is done, it'll be easier to fix the behavior for WebKit while allowing browsers (with different philosophies) to opt out. Doing so earlier will just make the diffs a lot more messy. And given that this has been a problem since the beginning, I don't think it's a big deal to wait a bit on it. :-) > > @@ -92,54 +88,18 @@ PassRefPtr<StorageArea> LocalStorage::st > ... > > - storageArea = LocalStorageArea::create(origin, this); > > + > > nit: ^^^ looks like some extraneous white spaces on that new line. The spacing is as it was originally...I just removed an extra tab it appears. I think the spacing is proper as it is now. > > > > Index: WebCore/storage/LocalStorageArea.cpp > ... > > void LocalStorageArea::scheduleFinalSync() > > { > > + ASSERT(isMainThread()); > > + if (!m_syncManager) > > + return; > > + > > nit: ^^^ looks like some extraneous white spaces on that new line. I guess I could remove it, but that code is logically grouped together. It's doing sanity checks and deciding if it should bail completely...essentially testing assumptions which are made throughout the rest of the function. I don't like how it reads when you remove the space, but I'll do it if you insist. > > > > Index: WebCore/storage/StorageSyncManager.cpp > ... > > +void StorageSyncManager::close() > ... > > + > > nit: ^^^ looks like some extraneous white spaces on that new line. > > > > +bool StorageSyncManager::scheduleImport(PassRefPtr<LocalStorageArea> area) > > { > > ASSERT(isMainThread()); > > - > > + > > nit: ^^^ looks like some extraneous white spaces on that new line. > > > > +void StorageSyncManager::scheduleSync(PassRefPtr<LocalStorageArea> area) > > { > > ASSERT(isMainThread()); > > + > > nit: ^^^ looks like some extraneous white spaces on that new line. Same as above...I'm putting a space between where I'm testing assumptions and where the real code is. This was done often in the code I started with, but I did add in a bunch of spaces to make it all consistent. I went ahead and made all the changes you asked for because making progress is more important than these nits, but I think they've decreased readability and consistency.
Jeremy Orlow
Comment 19 2009-06-19 16:32:39 PDT
Created attachment 31571 [details] patch 1 - v6 Realized I misinterpreted Darin's comments. Added back in the new lines without the trailing spaces. I'm pretty sure this is ready to +. :-)
Brady Eidson
Comment 20 2009-06-22 14:56:25 PDT
Sorry I've been absent through this whole process - I'd like to take a thorough look before this lands, which I am doing right now. I'll be more available from here on out!
Darin Fisher (:fishd, Google)
Comment 21 2009-06-22 15:03:31 PDT
> Sorry I've been absent through this whole process - I'd like to take a thorough > look before this lands, which I am doing right now. I'll be more available > from here on out! Looks like I just missed your message :( I went ahead and committed this patch as http://trac.webkit.org/changeset/44959. I'm happy to revert or revise as you suggest.
Jeremy Orlow
Comment 22 2009-06-22 15:13:29 PDT
If it's ok, I'd like to keep this in for now (so that I can prepare the next patch). These patches are really affecting my ability to get LocalStorage working in Chromium, and that in turn is holding up other people/projects (like extensions). I really did want you feedback before we started checking anything in, but I felt somewhat backed into a corner. There still are plenty of other patches to fix any issues you might have with the first patch, and we can always back the first out if there's some major architectural flaw. As I've mentioned (via private mail) I'm more than happy to come by Apple to talk these patches through in person, on IRC, or whatever. I know in person isn't normal, but it'd probably be most efficient (and an excuse for me to ride my motorcycle :-).
Brady Eidson
Comment 23 2009-06-22 15:44:32 PDT
None of that was the issue - timing and availability was all. Please keep me CC'ed on future patches going forward.
Jeremy Orlow
Comment 24 2009-06-22 15:46:53 PDT
Created attachment 31685 [details] patch 2 - v2 Simplify the interaction between LocalStorageArea/SessionStorageArea and StorageArea by creating a "blockUntilImportComplete()" function rather than bouncing back and forth between the child and parent classes in a somewhat unintuitive manner. This patch also paves the way for LocalStorageArea and SessionStorageArea being merged into one. It's part of several in a set which are separating syncing (to disk) code from the rest of the implementation so that local storage and session storage's code can be unified.
Brady Eidson
Comment 25 2009-06-22 15:55:07 PDT
In general we try not to keep reusing the same back for a sequence of patches. One bug -> one patch makes it possible to track changes. Could we try opening a new bug for the next patch(es) in the sequence please?
Jeremy Orlow
Comment 26 2009-06-22 15:57:43 PDT
So f(In reply to comment #23) > Please keep me CC'ed on future patches going forward. Pretty much all the work is happening in this bug, but I'll definitely keep that in mind in the future. Here are the only bugs about LocalStorage that _might_ have been of interest: https://bugs.webkit.org/show_bug.cgi?id=26356 https://bugs.webkit.org/show_bug.cgi?id=25970 These are even less related: https://bugs.webkit.org/show_bug.cgi?id=26154 https://bugs.webkit.org/show_bug.cgi?id=26180 P.S. It'd be awesome if you can take a look at the next patch today. (It's pretty straightforward.)
Dimitri Glazkov (Google)
Comment 27 2009-06-22 16:01:25 PDT
Jeremy, stop monopolizing bradee-oh's attention! :) We need him for this bug: https://bugs.webkit.org/show_bug.cgi?id=26054
Jeremy Orlow
Comment 28 2009-06-22 16:02:38 PDT
Moved discussion of the second patch to a new thread as requested: https://bugs.webkit.org/show_bug.cgi?id=26629 I'm leaving patches 3-5 here for reference (until I cut a "real" version of each).
Jeremy Orlow
Comment 29 2009-06-23 13:47:15 PDT
Adding the broken out bugs as dependencies.
Eric Seidel (no email)
Comment 30 2009-06-29 13:20:30 PDT
So is this bug still waiting to be landed? Or should it be closed?
Note You need to log in before you can comment on or make changes to this bug.