Summary: | Allow FileSystem API implementation to pass snapshot metadata at File creation time | ||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Kinuko Yasuda <kinuko> | ||||||||||||||||||||||
Component: | WebCore Misc. | Assignee: | Kinuko Yasuda <kinuko> | ||||||||||||||||||||||
Status: | RESOLVED FIXED | ||||||||||||||||||||||||
Severity: | Normal | CC: | eric.carlson, ericu, feature-media-reviews, fishd, gustavo, jianli, michaeln, pnormand, satorux, tzik, webkit.review.bot, xan.lopez | ||||||||||||||||||||||
Priority: | P2 | ||||||||||||||||||||||||
Version: | 528+ (Nightly build) | ||||||||||||||||||||||||
Hardware: | Unspecified | ||||||||||||||||||||||||
OS: | Unspecified | ||||||||||||||||||||||||
Bug Depends on: | 86745 | ||||||||||||||||||||||||
Bug Blocks: | |||||||||||||||||||||||||
Attachments: |
|
Description
Kinuko Yasuda
2012-02-16 23:24:18 PST
Created attachment 127541 [details]
Patch
Comment on attachment 127541 [details] Patch Attachment 127541 [details] did not pass gtk-ews (gtk): Output: http://queues.webkit.org/results/11540386 Comment on attachment 127541 [details] Patch Attachment 127541 [details] did not pass qt-ews (qt): Output: http://queues.webkit.org/results/11536503 Created attachment 127551 [details]
Patch
Comment on attachment 127551 [details] Patch Do you have the test for FileSystem snapshot? View in context: https://bugs.webkit.org/attachment.cgi?id=127551&action=review > Source/WebCore/fileapi/Blob.cpp:97 > + if (asFile()) Can you cache asFile since it is accessed more than once? > Source/WebCore/fileapi/File.h:108 > + mutable double m_snapshotModificationTime; m_snapshotTimestamp? I think we do not care if the timestamp is created from modification time or not. > Source/WebCore/fileapi/WebKitBlobBuilder.cpp:104 > + if (blob->asFile()) { ditto. > Source/WebCore/fileapi/WebKitBlobBuilder.cpp:107 > + blob->asFile()->maybeCaptureSnapshot(); This seems to change the semantics. We used to capture and use the latest snapshot when the blob is appended to the BlobBuilder. But now we use the same snapshot the first time that the blob is added to the builder. Created attachment 127801 [details]
Patch
Comment on attachment 127551 [details] Patch Thanks for reviewing! View in context: https://bugs.webkit.org/attachment.cgi?id=127551&action=review >> Source/WebCore/fileapi/Blob.cpp:97 >> + if (asFile()) > > Can you cache asFile since it is accessed more than once? Done. (Though compiler will likely produce the same code for both) >> Source/WebCore/fileapi/File.h:108 >> + mutable double m_snapshotModificationTime; > > m_snapshotTimestamp? I think we do not care if the timestamp is created from modification time or not. Sounds good, but in the current code we specifically set this value from metadata.modificationTime and return the value for lastModifiedDate so probably it'd be clearer to keep this as is. >> Source/WebCore/fileapi/WebKitBlobBuilder.cpp:104 >> + if (blob->asFile()) { > > ditto. Done. >> Source/WebCore/fileapi/WebKitBlobBuilder.cpp:107 >> + blob->asFile()->maybeCaptureSnapshot(); > > This seems to change the semantics. We used to capture and use the latest snapshot when the blob is appended to the BlobBuilder. But now we use the same snapshot the first time that the blob is added to the builder. Right. I made the change since the File API seems to be going to include more specific error for modified-after-snapshot cases and slice/BlobBuilder usage looked relatively safe to change (since we already take snapshot approach for them). ...but I don't want to introduce unwanted regressions either, so I just removed the line 138 in File.cpp. (TL;DR: Fixed.) Created attachment 127867 [details]
Patch
(In reply to comment #8) > Created an attachment (id=127867) [details] > Patch Removed some duplicated code and simplified the patch. (In reply to comment #9) > (In reply to comment #8) > > Created an attachment (id=127867) [details] [details] > > Patch > > Removed some duplicated code and simplified the patch. Friendly ping. Jian could you take another look? Thanks very much. Created attachment 128519 [details]
Patch
Added a test.
Comment on attachment 128519 [details] Patch It seems that we now have 2 different semantics when slicing a file into blob. For file-system file, we always use the snapshot that is obtained at the File object creation time. For real file, we always obtain and use the latest snapshot. Is this slicing behavior for file-system file somewhere described in the FileSystem spec? If not, can we also try to use the same behavior for file-system file? View in context: https://bugs.webkit.org/attachment.cgi?id=128519&action=review > Source/WebCore/fileapi/Blob.cpp:70 > + // FIXME: Calling size() and lastModifiedTimeInMicroseconds() may involve synchronous file operation. We need to figure out how to make it asynchronous or make File/Blob act as snapshot all the time as the new draft specifies. Now you split the call of captureSnapshot into 2 calls: size() and lastModifiedTimeInMicroseconds() and this might not guarantee that size and lastModifiedTime are for the same snapshot when slicing a file. Though current implementation of captureSnapshot also has this problem but we has a FIXME for this, your current change exaggerates this problem. Thanks. After some discussion this patch is no longer in a critical path for M19, so I'd like to revisit this sometime later. (In reply to comment #12) > (From update of attachment 128519 [details]) > It seems that we now have 2 different semantics when slicing a file into blob. For file-system file, we always use the snapshot that is obtained at the File object creation time. For real file, we always obtain and use the latest snapshot. Is this slicing behavior for file-system file somewhere described in the FileSystem spec? If not, can we also try to use the same behavior for file-system file? It's not really the spec'ed behavior, though I hoped changing only for files from FileSystem, possibly with some noisy FIXME comment, would be probably acceptable (at least it'd be harmless). > View in context: https://bugs.webkit.org/attachment.cgi?id=128519&action=review > > > Source/WebCore/fileapi/Blob.cpp:70 > > + // FIXME: Calling size() and lastModifiedTimeInMicroseconds() may involve synchronous file operation. We need to figure out how to make it asynchronous or make File/Blob act as snapshot all the time as the new draft specifies. > > Now you split the call of captureSnapshot into 2 calls: size() and lastModifiedTimeInMicroseconds() and this might not guarantee that size and lastModifiedTime are for the same snapshot when slicing a file. Though current implementation of captureSnapshot also has this problem but we has a FIXME for this, your current change exaggerates this problem. Ok, I'll revert the change when I resume working on this patch. Changed summary text. In general we should query File metadata (e.g. size and modifiedTime) on the fly, i.e. when it is queried, to keep the File semantics, but for some platform-specific filesystems it may not be feasible since synchronous metadata query could take very long time. We should allow each FileSystem API implementation to pass snapshot metadata so that File object could cache the given metadata not to make synchronous query. This should NOT be allowed for regular FileSystem types (i.e. Temporary or Persistent) not to break File API semantics. Created attachment 141372 [details]
Patch
Comment on attachment 141372 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=141372&action=review > Source/WebCore/Modules/filesystem/DOMFileSystem.cpp:126 > +class GetSnapshotFileCallback : public FileSystemCallbacksBase { I think it'd better to call it as GetMetadataCallback since we're only creating snapshot for certain scenario. > Source/WebCore/Modules/filesystem/DOMFileSystem.cpp:139 > + // For regular filesystem types (temporary or persistent) we should not cache file metadata as it could change File semantics. Add comma before we. > Source/WebCore/Modules/filesystem/DOMFileSystem.cpp:140 > + // For other filesystem types (which could be platform-specific ones) there's a chance that the files are on remote filesystem. If the port has returned metadata just pass it to File constructor (so we may cache the metadata). ditto. > Source/WebCore/Modules/filesystem/DOMFileSystemSync.cpp:118 > + // For regular filesystem types (temporary or persistent) we should not cache file metadata as it could change File semantics. ditto. > Source/WebCore/Modules/filesystem/DOMFileSystemSync.cpp:119 > + // For other filesystem types (which could be platform-specific ones) there's a chance that the files are on remote filesystem. If the port has returned metadata just pass it to File constructor (so we may cache the metadata). ditto. > Source/WebCore/Modules/mediastream/PeerConnection00.cpp:99 > + dictionary.get("has_video", audio); ?? > Source/WebCore/fileapi/Blob.cpp:78 > + long long size = m_size; It seems the original code is a bit more efficient. What do you need to change that? > Source/WebCore/fileapi/Blob.h:44 > +class File; Is this forward declaration needed? > Source/WebCore/fileapi/File.cpp:97 > + : Blob(createBlobDataForFileWithName(metadata.platformPath, name), -1) Should we also store both size and modification time, if provided by the remote system, in the blob data? Should we not pass -1 when caching the metadata? > Source/WebCore/fileapi/File.cpp:157 > +bool File::hasValidSnapshotMetadata() const It seems that hasValidSnapshotMetadata might'd better be exposed in FileMetadata. > Source/WebCore/fileapi/File.h:57 > + // For filesystem files. It takes metadata which could optionally give valid snapshot metadata, which will suppress future metadata query on this File object. The metadata argument must be used carefully if and only if it is infeasible to query file metadata on the fly (e.g. file is on remote filesystem). nit: grammar error. > Source/WebCore/fileapi/File.h:105 > + const FileMetadata m_snapshotMetadata; This might be duplicate. We could probable get the metadata directly out of the blob data. It seems that we can use a boolean flag to denote a snapshot file case. Created attachment 141933 [details]
Patch
Thanks, I updated the patch. View in context: https://bugs.webkit.org/attachment.cgi?id=141372&action=review >> Source/WebCore/Modules/filesystem/DOMFileSystem.cpp:126 >> +class GetSnapshotFileCallback : public FileSystemCallbacksBase { > > I think it'd better to call it as GetMetadataCallback since we're only creating snapshot for certain scenario. Fixed. >> Source/WebCore/Modules/filesystem/DOMFileSystem.cpp:139 >> + // For regular filesystem types (temporary or persistent) we should not cache file metadata as it could change File semantics. > > Add comma before we. Done. >> Source/WebCore/Modules/filesystem/DOMFileSystem.cpp:140 >> + // For other filesystem types (which could be platform-specific ones) there's a chance that the files are on remote filesystem. If the port has returned metadata just pass it to File constructor (so we may cache the metadata). > > ditto. Done. >> Source/WebCore/Modules/filesystem/DOMFileSystemSync.cpp:118 >> + // For regular filesystem types (temporary or persistent) we should not cache file metadata as it could change File semantics. > > ditto. Done. >> Source/WebCore/Modules/filesystem/DOMFileSystemSync.cpp:119 >> + // For other filesystem types (which could be platform-specific ones) there's a chance that the files are on remote filesystem. If the port has returned metadata just pass it to File constructor (so we may cache the metadata). > > ditto. Done. >> Source/WebCore/Modules/mediastream/PeerConnection00.cpp:99 >> + dictionary.get("has_video", audio); > > ?? Bad merge... dropped. >> Source/WebCore/fileapi/Blob.cpp:78 >> + long long size = m_size; > > It seems the original code is a bit more efficient. What do you need to change that? It doesn't make difference in binaries with most compilers. Anyway this change isn't really necessary so I reverted it. >> Source/WebCore/fileapi/Blob.h:44 >> +class File; > > Is this forward declaration needed? Nope, removed. >> Source/WebCore/fileapi/File.cpp:97 >> + : Blob(createBlobDataForFileWithName(metadata.platformPath, name), -1) > > Should we also store both size and modification time, if provided by the remote system, in the blob data? Should we not pass -1 when caching the metadata? Good point. I updated the code along with the line. >> Source/WebCore/fileapi/File.cpp:157 >> +bool File::hasValidSnapshotMetadata() const > > It seems that hasValidSnapshotMetadata might'd better be exposed in FileMetadata. I removed this method. (Now the validation check is a bit relaxed than the previous patch but I think it'd be ok) >> Source/WebCore/fileapi/File.h:57 >> + // For filesystem files. It takes metadata which could optionally give valid snapshot metadata, which will suppress future metadata query on this File object. The metadata argument must be used carefully if and only if it is infeasible to query file metadata on the fly (e.g. file is on remote filesystem). > > nit: grammar error. I revised the comment trying to make the sentence simpler. (Sorry I wasn't sure which part was particularly wrong) >> Source/WebCore/fileapi/File.h:105 >> + const FileMetadata m_snapshotMetadata; > > This might be duplicate. We could probable get the metadata directly out of the blob data. It seems that we can use a boolean flag to denote a snapshot file case. I changed the code to use Blob::m_size and only cache snapshot modificationTime. Comment on attachment 141933 [details] Patch Attachment 141933 [details] did not pass mac-ews (mac): Output: http://queues.webkit.org/results/12685746 Created attachment 141940 [details]
Patch
Comment on attachment 141940 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=141940&action=review > Source/WebCore/fileapi/File.cpp:68 > + contentType = MIMETypeRegistry::getWellKnownMIMETypeForExtension(fileSystemName.substring(index + 1)); Can this content type extraction code be shared among all these 3 static methods? > Source/WebCore/fileapi/File.cpp:146 > + if (m_size >= 0) Though this logic works, I think it is inclined to mistakes in the future since m_size is indeed defined and initialized in Blob. Suppose someone changes the default value for m_size or assign some value for m_size in Blob implementation, we will return m_size without doing query for normal filesystem files. I think it is safer to explicitly define a boolean flag in File to mean that a snapshot is provided. Or, can we solely use m_snapshotModificationTime to find out if the snapshot is valid? > Source/WebCore/fileapi/File.h:30 > +#include "FileMetadata.h" It seems that we can use forward declaration instead. > Source/WebCore/fileapi/File.h:57 > + // For filesystem files. It takes metadata argument, which may suppress future metadata query on this File object if it has valid information (i.e. non-zero modificationTime and/or non-negative file length). The metadata argument must be given carefully only when it is infeasible to query file metadata on the fly (e.g. file is on remote filesystem). Maybe simpler and easier to understand if this is rephrased to something like: If filesystem files lives in the remote filesystem, the port might pass the valid metadata (non-zero modificationTime and/or non-negative file size) and cache it in the File object. > LayoutTests/ChangeLog:9 > + * fast/filesystem/file-read-after-write.html: Added. This tests seems to purely verify that the metadata is not cached, right? > LayoutTests/fast/filesystem/file-metadata-after-write.html:59 > +function test(expect, actual) Can we rely on those test utils provided by js-test-pre.js? Created attachment 142155 [details]
Patch
Comment on attachment 141940 [details] Patch Thanks, I updated the patch. View in context: https://bugs.webkit.org/attachment.cgi?id=141940&action=review >> Source/WebCore/fileapi/File.cpp:68 >> + contentType = MIMETypeRegistry::getWellKnownMIMETypeForExtension(fileSystemName.substring(index + 1)); > > Can this content type extraction code be shared among all these 3 static methods? Done. >> Source/WebCore/fileapi/File.cpp:146 >> + if (m_size >= 0) > > Though this logic works, I think it is inclined to mistakes in the future since m_size is indeed defined and initialized in Blob. Suppose someone changes the default value for m_size or assign some value for m_size in Blob implementation, we will return m_size without doing query for normal filesystem files. I think it is safer to explicitly define a boolean flag in File to mean that a snapshot is provided. Or, can we solely use m_snapshotModificationTime to find out if the snapshot is valid? Good point. Adding a flag sounds good, but it still may not stop Blob to assign some value in the future... To make it explicit and parallel to snapshotModificationTime I just added snapshotSize field, assuming 64-bit increase per File object will be ok. >> Source/WebCore/fileapi/File.h:30 >> +#include "FileMetadata.h" > > It seems that we can use forward declaration instead. Done. >> Source/WebCore/fileapi/File.h:57 >> + // For filesystem files. It takes metadata argument, which may suppress future metadata query on this File object if it has valid information (i.e. non-zero modificationTime and/or non-negative file length). The metadata argument must be given carefully only when it is infeasible to query file metadata on the fly (e.g. file is on remote filesystem). > > Maybe simpler and easier to understand if this is rephrased to something like: > If filesystem files lives in the remote filesystem, the port might pass the valid metadata (non-zero modificationTime and/or non-negative file size) and cache it in the File object. Sounds good! Updated. >> LayoutTests/ChangeLog:9 >> + * fast/filesystem/file-read-after-write.html: Added. > > This tests seems to purely verify that the metadata is not cached, right? Yes. I added the explanation in the ChangeLog. >> LayoutTests/fast/filesystem/file-metadata-after-write.html:59 >> +function test(expect, actual) > > Can we rely on those test utils provided by js-test-pre.js? This is a tiny piece of code and I prefer making this test standalone, but if using it is still a recommended way I'll follow. Created attachment 142414 [details]
Patch
Comment on attachment 141940 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=141940&action=review >>> LayoutTests/fast/filesystem/file-metadata-after-write.html:59 >>> +function test(expect, actual) >> >> Can we rely on those test utils provided by js-test-pre.js? > > This is a tiny piece of code and I prefer making this test standalone, but if using it is still a recommended way I'll follow. On the second thought using existing function would be always nice, I updated the patch to use js-test-pre.js. Comment on attachment 142414 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=142414&action=review > Source/WebCore/fileapi/File.cpp:163 > + if (m_snapshotSize >= 0 && m_snapshotModificationTime) { This seems a bit inconsistent w/ if checks in lastModifiedDate() and size(). Your comment below says valid metadata is meant by non-zero modificationTime and/or non-negative file size. Can we have "or" scenario? If yes, please change && to ||. Otherwise, please always use "m_snapshotSize >= 0 && m_snapshotModificationTime" for all 3 checks. > Source/WebCore/fileapi/File.h:57 > + // If filesystem files live in the remote filesystem, the port might pass the valid metdata (non-zero modificationTime and/or non-negative file size) and cache in the File object. typo: metdata > Source/WebCore/fileapi/File.h:102 > +#if ENABLE(FILE_SYSTEM) Please comment that what scenario these 2 members should be used. Also please illustrate the default values. > LayoutTests/fast/filesystem/file-metadata-after-write-expected.txt:1 > +Writing 1234567890 to the file... Better to output a line of description to explain what we want to test here: Test that metadata is not cached in the regular temporary filesystem. > LayoutTests/fast/filesystem/file-metadata-after-write.html:84 > + layoutTestController.notifyDone(); nit: wrong indenting. > LayoutTests/fast/filesystem/file-metadata-after-write.html:90 > + nit: extra line Comment on attachment 142414 [details] Patch Thanks! View in context: https://bugs.webkit.org/attachment.cgi?id=142414&action=review >> Source/WebCore/fileapi/File.cpp:163 >> + if (m_snapshotSize >= 0 && m_snapshotModificationTime) { > > This seems a bit inconsistent w/ if checks in lastModifiedDate() and size(). Your comment below says valid metadata is meant by non-zero modificationTime and/or non-negative file size. Can we have "or" scenario? If yes, please change && to ||. Otherwise, please always use "m_snapshotSize >= 0 && m_snapshotModificationTime" for all 3 checks. By the comment I meant we check two values with '&&' for snapshot but only check each value in size()/lastModifiedDate(). For better consistency I'll change the code to always check with '&&' in all cases. Committed r117432: <http://trac.webkit.org/changeset/117432> Re-opened since this is blocked by 86745 Looks ok now. |