WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
144192
SharedBuffer::createWithContentsOfFile should use map file routines
https://bugs.webkit.org/show_bug.cgi?id=144192
Summary
SharedBuffer::createWithContentsOfFile should use map file routines
youenn fablet
Reported
2015-04-25 10:07:21 PDT
SharedBuffer::createWithContentsOfFile currently read data using standard I/O routines (open, read, close or similar according specific platforms). Using map files routines would optimize memory allocation.
Attachments
Patch
(13.78 KB, patch)
2015-04-25 13:01 PDT
,
youenn fablet
no flags
Details
Formatted Diff
Diff
Trying to fix mac build
(14.61 KB, patch)
2015-04-25 13:37 PDT
,
youenn fablet
no flags
Details
Formatted Diff
Diff
MappedFileData version
(19.62 KB, patch)
2015-04-27 07:58 PDT
,
youenn fablet
no flags
Details
Formatted Diff
Diff
Fixing build
(19.73 KB, patch)
2015-04-27 08:51 PDT
,
youenn fablet
no flags
Details
Formatted Diff
Diff
Using std::unique_ptr and adding unit tests
(37.33 KB, patch)
2015-04-30 08:06 PDT
,
youenn fablet
no flags
Details
Formatted Diff
Diff
move constructor approach
(36.65 KB, patch)
2015-05-04 12:17 PDT
,
youenn fablet
no flags
Details
Formatted Diff
Diff
Patch
(34.53 KB, patch)
2015-05-05 11:51 PDT
,
youenn fablet
no flags
Details
Formatted Diff
Diff
Constructor with boolean
(31.37 KB, patch)
2015-05-13 03:21 PDT
,
youenn fablet
no flags
Details
Formatted Diff
Diff
Style improvements
(30.73 KB, patch)
2015-05-13 05:30 PDT
,
youenn fablet
no flags
Details
Formatted Diff
Diff
Patch for landing
(30.60 KB, patch)
2015-05-14 00:00 PDT
,
youenn fablet
no flags
Details
Formatted Diff
Diff
Show Obsolete
(9)
View All
Add attachment
proposed patch, testcase, etc.
Darin Adler
Comment 1
2015-04-25 11:37:26 PDT
Note that mapping and copying the data are only equivalent if the file contents are not subsequently modified. So the semantics aren’t identical. So we might need to audit callers or change the function’s name.
youenn fablet
Comment 2
2015-04-25 13:01:33 PDT
Created
attachment 251644
[details]
Patch
youenn fablet
Comment 3
2015-04-25 13:37:04 PDT
Created
attachment 251647
[details]
Trying to fix mac build
Darin Adler
Comment 4
2015-04-25 17:07:30 PDT
Comment on
attachment 251647
[details]
Trying to fix mac build View in context:
https://bugs.webkit.org/attachment.cgi?id=251647&action=review
> Source/WebCore/platform/SharedBuffer.cpp:129 > + return static_cast<unsigned>(m_fileSize);
What guarantees m_fileSize will fit in 32 bits? I think this shows that it’s not really a good idea to store a size in size_t, since we’ll have to convert it to unsigned later anyway, and can’t handle values that are too large to fit in an unsigned.
> Source/WebCore/platform/SharedBuffer.cpp:194 > + char* fileData = m_fileData; > + m_fileData = nullptr;
There’s a nice way to write this kind of shuffle in modern C++: char* fileData = std::exchange(m_fileData, nullptr);
> Source/WebCore/platform/SharedBuffer.cpp:197 > + m_fileData = fileData; > + clearFileData();
This “swizzling the data pointer back to we can call clearFileData” thing is pretty ugly. A better pattern would be to make a struct or class that holds the data pointer, file descriptor, and size, then we could move the whole thing out of the SharedBuffer with MappedFileData fileData = WTF::move(m_fileData) and then let it get destroyed/cleaned up when it falls out of scope. class MappedFileData {
> Source/WebCore/platform/SharedBuffer.cpp:387 > +PassRefPtr<SharedBuffer> SharedBuffer::createWithContentsOfFile(const String& filePath)
New functions should not return PassRefPtr; this should return RefPtr. See <
http://www.webkit.org/coding/RefPtr.html
> for details.
> Source/WebCore/platform/SharedBuffer.cpp:390 > + if (filePath.isEmpty()) > + return nullptr;
Why is this needed? Won’t the open call below fail on this without a special case?
> Source/WebCore/platform/SharedBuffer.cpp:393 > + CString filename = fileSystemRepresentation(filePath); > + int fd = open(filename.data(), O_RDONLY);
No need for the local variable here: int fd = open(fileSystemRepresentation(filePath).data(), O_RDONLY);
> Source/WebCore/platform/SharedBuffer.cpp:404 > + if (fileStat.st_size < 0 || size != static_cast<unsigned long long>(fileStat.st_size)) {
This isn’t right. The type of st_size is not unsigned long long, it’s off_t. I think what we want to write here is: unsigned size = static_cast<unsigned>(fileStat.st_size); if (size != fileStat.st_size) { or if (fileStat.st_size < 0 || fileStat.st_size > std::numeric_limits<unsigned>::max()) { ... } unsigned size = static_cast<unsigned>(fileStat.st_size); I wonder if we want to special-case short files or empty files. File descriptors can be a scarce resource. Do we need to do anything to prevent from using too many?
> Source/WebCore/platform/SharedBuffer.cpp:409 > + char* data = static_cast<char*>(mmap(0, size, PROT_READ, MAP_FILE | MAP_SHARED, fd, 0));
I’m not sure I understand why we’re using a char* for all these types rather than a void*. I think void* would be better.
> Source/WebCore/platform/SharedBuffer.cpp:411 > + LOG_ERROR("Failed to map in memory file %s", filenameForDisplay(filePath).utf8().data());
I think it’s a little strange for the file system functions to log their errors. Should they? DO the others?
> Source/WebCore/platform/SharedBuffer.h:180 > +#if !PLATFORM(WIN) > + int m_fileDescriptor { 0 }; > +#endif > + char* m_fileData { nullptr }; > + size_t m_fileSize { 0 };
Why are m_fileData and m_fileSize outside the #if?
youenn fablet
Comment 5
2015-04-26 12:55:29 PDT
Thanks for the review. I will update the patch accordingly. I will put
bug 143711
as r? (In reply to
comment #4
)
> Comment on
attachment 251647
[details]
> Trying to fix mac build > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=251647&action=review
> > > Source/WebCore/platform/SharedBuffer.cpp:129 > > + return static_cast<unsigned>(m_fileSize); > > What guarantees m_fileSize will fit in 32 bits? I think this shows that it’s > not really a good idea to store a size in size_t, since we’ll have to > convert it to unsigned later anyway, and can’t handle values that are too > large to fit in an unsigned.
It is asserted in the constructor and checked in createWithContentsOfFile.
> > > Source/WebCore/platform/SharedBuffer.cpp:194 > > + char* fileData = m_fileData; > > + m_fileData = nullptr; > > There’s a nice way to write this kind of shuffle in modern C++: > > char* fileData = std::exchange(m_fileData, nullptr); > > > Source/WebCore/platform/SharedBuffer.cpp:197 > > + m_fileData = fileData; > > + clearFileData(); > > This “swizzling the data pointer back to we can call clearFileData” thing is > pretty ugly. A better pattern would be to make a struct or class that holds > the data pointer, file descriptor, and size, then we could move the whole > thing out of the SharedBuffer with MappedFileData fileData = > WTF::move(m_fileData) and then let it get destroyed/cleaned up when it falls > out of scope. > > class MappedFileData {
That is good. MappedFileData may be reused elsewhere in the future.
> > > Source/WebCore/platform/SharedBuffer.cpp:387 > > +PassRefPtr<SharedBuffer> SharedBuffer::createWithContentsOfFile(const String& filePath) > > New functions should not return PassRefPtr; this should return RefPtr. See > <
http://www.webkit.org/coding/RefPtr.html
> for details.
Well, it is not a new function, but I am ok with this refactoring.
> > Source/WebCore/platform/SharedBuffer.cpp:390 > > + if (filePath.isEmpty()) > > + return nullptr; > > Why is this needed? Won’t the open call below fail on this without a special > case?
It was part of the previous posix routine. OK with removing it.
> > > Source/WebCore/platform/SharedBuffer.cpp:393 > > + CString filename = fileSystemRepresentation(filePath); > > + int fd = open(filename.data(), O_RDONLY); > > No need for the local variable here: > > int fd = open(fileSystemRepresentation(filePath).data(), O_RDONLY); > > > Source/WebCore/platform/SharedBuffer.cpp:404 > > + if (fileStat.st_size < 0 || size != static_cast<unsigned long long>(fileStat.st_size)) { > > This isn’t right. The type of st_size is not unsigned long long, it’s off_t. > I think what we want to write here is: > > unsigned size = static_cast<unsigned>(fileStat.st_size); > if (size != fileStat.st_size) { > > or > > if (fileStat.st_size < 0 || fileStat.st_size > > std::numeric_limits<unsigned>::max()) { > ... > } > unsigned size = static_cast<unsigned>(fileStat.st_size); > > I wonder if we want to special-case short files or empty files. > > File descriptors can be a scarce resource. Do we need to do anything to > prevent from using too many?
Currently we do not need it but this might be an issue in the future. A FIXME or comment might be useful.
> > > Source/WebCore/platform/SharedBuffer.cpp:409 > > + char* data = static_cast<char*>(mmap(0, size, PROT_READ, MAP_FILE | MAP_SHARED, fd, 0)); > > I’m not sure I understand why we’re using a char* for all these types rather > than a void*. I think void* would be better.
void* and size_t are input/output to mmap/munmap. Hence why MappedFileData could store these as void*/size_t
> > > Source/WebCore/platform/SharedBuffer.cpp:411 > > + LOG_ERROR("Failed to map in memory file %s", filenameForDisplay(filePath).utf8().data()); > > I think it’s a little strange for the file system functions to log their > errors. Should they? DO the others?
This was part of the gtk version of SharedBuffer::create... I will check this further.
> > > Source/WebCore/platform/SharedBuffer.h:180 > > +#if !PLATFORM(WIN) > > + int m_fileDescriptor { 0 }; > > +#endif > > + char* m_fileData { nullptr }; > > + size_t m_fileSize { 0 }; > > Why are m_fileData and m_fileSize outside the #if?
This seemed useful to WIN platform as well and this allowed limiting #ifdef WIN in SharedBuffer.cpp. This will be easier to handle in MappedFileData.
youenn fablet
Comment 6
2015-04-27 07:58:41 PDT
Created
attachment 251740
[details]
MappedFileData version
youenn fablet
Comment 7
2015-04-27 08:51:57 PDT
Created
attachment 251744
[details]
Fixing build
youenn fablet
Comment 8
2015-04-27 09:38:00 PDT
> I wonder if we want to special-case short files or empty files. > > File descriptors can be a scarce resource. Do we need to do anything to > prevent from using too many?
Current mapped files through SharedBuffer::createWithContentsOfFile are released quite quickly. I added a FIXME.
> > Source/WebCore/platform/SharedBuffer.cpp:411 > > + LOG_ERROR("Failed to map in memory file %s", filenameForDisplay(filePath).utf8().data()); > > I think it’s a little strange for the file system functions to log their > errors. Should they? DO the others?
I saw some in linux based files but most don't so I removed them. I see some value in logging open errors, the fact that a file is missing in particular.
Darin Adler
Comment 9
2015-04-27 09:43:31 PDT
(In reply to
comment #8
)
> > I think it’s a little strange for the file system functions to log their > > errors. Should they? DO the others? > > I saw some in linux based files but most don't so I removed them. > I see some value in logging open errors, the fact that a file is missing in > particular.
I’m not so sure. That makes assumptions about what the higher level code is doing. What if it’s expected that the file is usually missing and we end up with logging every time? To be honest I’m not sure either way.
Carlos Garcia Campos
Comment 10
2015-04-27 09:48:14 PDT
Comment on
attachment 251744
[details]
Fixing build View in context:
https://bugs.webkit.org/attachment.cgi?id=251744&action=review
> Source/WebCore/platform/FileSystem.cpp:150 > + return adoptRef(new MappedFileData(fd, data, size));
Do we really need to keep the fd around? I think we can close the fd right after the mmap, the map will be alive until it's unmapped.
> Source/WebCore/platform/FileSystem.cpp:162 > +void MappedFileData::clear()
This is only used by the destructor, I think we can move this to the destructor instead.
> Source/WebCore/platform/FileSystem.cpp:164 > + if (m_fileData) {
This could be an early return.
> Source/WebCore/platform/FileSystem.cpp:170 > + m_fileSize = 0; > + m_fileData = nullptr; > + m_fileDescriptor = 0;
And we don't need all this in the destructor.
> Source/WebCore/platform/FileSystem.cpp:174 > +const void* MappedFileData::data()
This should be const
> Source/WebCore/platform/FileSystem.cpp:179 > +unsigned MappedFileData::size()
Ditto.
> Source/WebCore/platform/FileSystem.h:222 > + int m_fileDescriptor { 0 };
If we need to keep the file descriptor, it should be initialized to -1
> Source/WebCore/platform/SharedBuffer.cpp:99 > +SharedBuffer::SharedBuffer(MappedFileData& fileData)
This could probably receive a MappedFileData&& to transfer the ownership of the map.
> Source/WebCore/platform/SharedBuffer.cpp:189 > + maybeTransferMappedFileData(); > maybeTransferPlatformData();
The mapped file is mostly the same than platform data, I wonder if we could implement it as part of the platform data, with common code to create map, but using the platform data for that, so we don't need any change in SharedBuffer
> Source/WebCore/platform/SharedBuffer.cpp:385 > + RefPtr<MappedFileData> mappedFileData = MappedFileData::create(filePath);
Why does the MappedFileData class need to be refcounted? It's created here and owned by the SharedBuffer.
Darin Adler
Comment 11
2015-04-27 09:56:59 PDT
Comment on
attachment 251744
[details]
Fixing build View in context:
https://bugs.webkit.org/attachment.cgi?id=251744&action=review
Looking generally good, but I don’t think we should be using a heap-allocated reference counted object for MappedFileData.
> Source/WebCore/platform/FileSystem.cpp:171 > + if (m_fileData) { > + munmap(m_fileData, m_fileSize); > + close(m_fileDescriptor); > + > + m_fileSize = 0; > + m_fileData = nullptr; > + m_fileDescriptor = 0; > + }
This would read more clearly with an early return. Not sure why we are setting m_fileDescriptor to 0 here. I think 0 can be a valid file descriptor. I can imagine setting it to -1 here or just leaving it set to an arbitrary value.
> Source/WebCore/platform/FileSystem.cpp:182 > +const void* MappedFileData::data() > +{ > + return m_fileData; > +} > + > +unsigned MappedFileData::size() > +{ > + return m_fileSize; > +}
These should probably be inline in the header.
> Source/WebCore/platform/FileSystem.h:208 > +class MappedFileData: public RefCounted<MappedFileData> {
This shouldn’t be reference counted. Instead it should have move semantics. It should have a move constructor and move assignment operator. Missing space before the colon.
> Source/WebCore/platform/FileSystem.h:210 > + static RefPtr<MappedFileData> create(const String&);
I think this should just be a constructor, not a factory function. If the mapping fails, the MappedFileData can indicate the failure state, maybe by leaving m_fileDescriptor equal to -1. Maybe we’d have an isValid function that returns m_fileDescriptor >= 0.
> Source/WebCore/platform/FileSystem.h:214 > + void clear();
I don’t think we should have this function.
> Source/WebCore/platform/FileSystem.h:218 > +private:
I think the “private” need stop go inside the #if, but somehow it compiled on Windows.
> Source/WebCore/platform/FileSystem.h:222 > + int m_fileDescriptor { 0 };
This should be -1.
> Source/WebCore/platform/SharedBuffer.cpp:99 > +SharedBuffer::SharedBuffer(MappedFileData& fileData)
Should take a MappedFileData&& and move it in.
> Source/WebCore/platform/SharedBuffer.cpp:100 > + : m_size(0)
m_size should be initialized to 0 in the header, now here.
> Source/WebCore/platform/SharedBuffer.cpp:240 > + if (m_fileData) > + m_fileData = nullptr;
m_fileData = { };
> Source/WebCore/platform/SharedBuffer.cpp:383 > +RefPtr<SharedBuffer> SharedBuffer::createWithContentsOfFile(const String& filePath)
Where’s the #if that keeps this from compiling on Windows?
> Source/WebCore/platform/SharedBuffer.h:31 > +#include "FileSystem.h" > +
Should not have a blank line here between the "" includes and the <> includes.
youenn fablet
Comment 12
2015-04-30 08:06:07 PDT
Created
attachment 252058
[details]
Using std::unique_ptr and adding unit tests
youenn fablet
Comment 13
2015-04-30 08:15:59 PDT
(In reply to
comment #12
)
> Created
attachment 252058
[details]
> Using std::unique_ptr and adding unit tests
Thanks for all comments. I tried to integrate them in this patch. I sticked with the MappedFileData::create(In reply to
comment #12
)
> Created
attachment 252058
[details]
> Using std::unique_ptr and adding unit tests
Thanks for all comments. I tried integrating them in the latest patch. I sticked with MappedFileData::create to handle empty file (fileSize=0) vs error (null_ptr). Regarding platformData, I am not sure we can merge the two as the code paths are specific for each platform for platformData while for mappedData, code will hopefully be the same except for windows. I added some C++ unit tests. But maybe this can be landed separately: unit tests do not apply yet to EFL and tests will not pass for Win (checking whether they compile on Windows).
Darin Adler
Comment 14
2015-04-30 08:36:44 PDT
(In reply to
comment #13
)
> I sticked with MappedFileData::create to handle empty file (fileSize=0) vs > error (null_ptr).
That’s not a good choice, because it requires putting each MappedFileData object on the heap with new rather than putting it in a normal variable on the stack. Classes that force a certain kind of allocation are not a good pattern. I understand the desire to report errors in a clear way, but forcing heap allocation just so that we can report the error is not a good pattern.
youenn fablet
Comment 15
2015-05-04 12:17:38 PDT
Created
attachment 252323
[details]
move constructor approach
Darin Adler
Comment 16
2015-05-04 15:32:44 PDT
Comment on
attachment 252323
[details]
move constructor approach View in context:
https://bugs.webkit.org/attachment.cgi?id=252323&action=review
> Source/WebCore/platform/FileSystem.cpp:155 > + if (fileStat.st_size < 0 || fileStat.st_size > INT_MAX) {
A little strange to use INT_MAX here, but unsigned for the type of the size. I would have used std::numeric_limits<unsigned>::max().
> Source/WebCore/platform/FileSystem.cpp:163 > + if (!size) > + return;
Missing close(fd) here.
> Source/WebCore/platform/FileSystem.h:209 > + MappedFileData() { }
Should not need to implement this. Default implementation should work fine without declaration.
> Source/WebCore/platform/FileSystem.h:213 > + WEBCORE_EXPORT MappedFileData(MappedFileData&&);
Also need to implement operator=(MappedFileData&&) to prevent the default assignment operator from being defined.
> Source/WebCore/platform/FileSystem.h:217 > + WEBCORE_EXPORT void reset();
I would prefer to not have a separate reset function. I would just write this: fileData = { }; That would call the default constructor and the move assignment operator.
> Source/WebCore/platform/FileSystem.h:218 > + operator bool() const { return !!m_fileData; }
Should be explicit operator bool, or should omit this entirely.
> Source/WebCore/platform/SharedBuffer.cpp:39 > +#include <fcntl.h> > +#include <sys/mman.h> > +#include <sys/stat.h> > +#include <wtf/text/CString.h>
Why these includes? I don’t think any of them are needed any more.
> Source/WebCore/platform/SharedBuffer.cpp:99 > + : m_size(0)
Should initialize this in the class definition instead of here.
> Source/WebCore/platform/SharedBuffer.cpp:387 > + long long size; > + if (!getFileSize(filePath, size)) > + return nullptr;
Why do we need to get the file size twice? MappedFileData also gets the size. I suggest we write the code differently so we don’t get the size twice.
> Source/WebCore/platform/SharedBuffer.h:96 > - void clear(); > + WEBCORE_EXPORT void clear();
Why this new export?
> Source/WebCore/platform/SharedBuffer.h:105 > - PassRefPtr<SharedBuffer> copy() const; > + WEBCORE_EXPORT PassRefPtr<SharedBuffer> copy() const;
Why this new export?
> Source/WebCore/platform/SharedBuffer.h:133 > + SharedBuffer(MappedFileData&&);
Should be explicit.
youenn fablet
Comment 17
2015-05-05 11:51:53 PDT
Created
attachment 252393
[details]
Patch
youenn fablet
Comment 18
2015-05-05 11:54:10 PDT
Thanks for the review, latest patch should cover the different improvements you mentioned. MappedFileData is currently put in WebCore/platform, but it could also be put in WTF?
> > Source/WebCore/platform/SharedBuffer.h:96 > > - void clear(); > > + WEBCORE_EXPORT void clear(); > > Why this new export?
It is apparently required to compile the new API tests on Mac platform.
Darin Adler
Comment 19
2015-05-06 07:26:01 PDT
(In reply to
comment #18
)
> MappedFileData is currently put in WebCore/platform, but it could also be > put in WTF?
I wouldn’t suggest moving it down to that layer unless we move other things like FileSystem.h, but it’s fine to move it if we have a specific client that wants to use it in, say, JavaScriptCore.
Darin Adler
Comment 20
2015-05-06 08:48:59 PDT
Comment on
attachment 252393
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=252393&action=review
This patch is OK as is, but I still see room for improvement. I didn’t carefully study all the SharedBuffer functions that this patch doesn’t touch to make sure that they all handle m_fileData correctly, but otherwise I looked closely at all the code.
> Source/WebCore/platform/FileSystem.cpp:125 > +void MappedFileData::swap(MappedFileData& fd)
In WebKit coding style we avoid argument names that are letters or letter clusters and use words instead. Here I would use the word "data" or the word "other".
> Source/WebCore/platform/FileSystem.cpp:131 > +MappedFileData::MappedFileData(MappedFileData&& fd)
This function should be inline in the header, since it has a trivial implementation that wed like to have be inlined.
> Source/WebCore/platform/FileSystem.cpp:136 > +MappedFileData& MappedFileData::operator=(MappedFileData&& fd)
This function should be inline in the header, since it has a trivial implementation that wed like to have be inlined.
> Source/WebCore/platform/FileSystem.cpp:138 > + swap(fd);
This is OK and it’s legal C++, but I slightly prefer an operator= implementation that clears out the other object rather than swapping. It makes a class that behaves a bit more logically. There are times when you want to write a = WTF::move(b) and be sure that b is left empty instead of possibly having state from a. I would write this: m_fileData = std::exchange(other.m_fileData, nullptr); m_fileSize = std::exchange(other.m_fileSize, 0); I’d do the equivalent with construction syntax for the move constructor, and I would omit the swap function entirely.
> Source/WebCore/platform/FileSystem.cpp:150 > +MappedFileData::~MappedFileData() > +{ > +#if !PLATFORM(WIN) > + if (!m_fileData) > + return; > + munmap(m_fileData, m_fileSize); > + m_fileData = nullptr; > +#endif > +}
This should go in FileSystemPOSIX.cpp: The other file system functions that use these APIs are in there.
> Source/WebCore/platform/FileSystem.cpp:153 > +bool MappedFileData::map(const String& filePath)
This should go in FileSystemPOSIX.cpp: All the other file system functions that use these APIs are in there.
> Source/WebCore/platform/FileSystem.cpp:155 > + int fd = open(fileSystemRepresentation(filePath).data(), O_RDONLY);
This is missing error handling for the case when fileSystemRepresentation returns a null string. You can see the proper handling of this, although it’s in unnecessarily wordy coding style, in FileSystemPOSIX.cpp.
> Source/WebCore/platform/FileSystem.h:209 > + MappedFileData() { }
We should not write this explicitly. The compiler will generate the same thing if we simply leave this line out. However we would need this if we decided to add another constructor (see my comment below).
> Source/WebCore/platform/FileSystem.h:211 > + WEBCORE_EXPORT bool map(const String&);
I’m not sure this design is the best way to go. There’s a bug in the implementation: If a client calls map a second time on a mapped file data that is already mapped, then it will leak the old mmap’d region. If we want to go with this design, then we need to fix the map function or at the very least require that callers not try to use it that way and assert that. And from a design elegance point of view, if there is member function named map then I think we should have another member function named unmap, like the reset function you had in an earlier patch that I asked you to remove. Another design would be to make the function that actually does the mapping be a constructor or a function that returns a MappedFileData rather than a function that mutates an existing MappedFileData. I prefer this approach; that way you can’t make the mistake of trying to map something that is already mapped. I suppose since we need an way to tell an error apart from an empty file we can either use multiple return values (error indication plus the MappedFileData), or a bool& out argument from the constructor to tell the caller whether it failed, or we can have an error state of MappedFileData, distinct from empty data. This could be distinguished with a separate boolean or with a special value for m_fileData or m_fileSize as long as we are careful about what the accessor member functions return in such a case. Finally, I don’t think the purpose of the string argument is clear without an argument name. We should call it filename or filePath, I think. I am not happy with the inconsistent terminology: "filename", "file name", "fileName", "path", "path name", "pathName", "filePath". I think we should chose one instead of mixing them all.
> Source/WebCore/platform/SharedBuffer.cpp:36 > +#if !PLATFORM(WIN) > +#include "FileSystem.h" > +#endif
We don’t need this include. It’s already included in the header.
> Source/WebCore/platform/SharedBuffer.cpp:95 > + : m_size(0)
Can we initialize m_size in the class definition in the header instead of doing it here, please?
> Source/WebCore/platform/SharedBuffer.cpp:96 > + , m_buffer(adoptRef(new DataBuffer))
It seems wasteful that we allocate a buffer that likely won’t be used. Can we make this get allocated when needed instead? Perhaps that would be inside the maybeTransferMappedFileData function.
> Source/WebCore/platform/SharedBuffer.cpp:235 > + if (m_fileData) > + m_fileData = { };
No need for the "if" here. Unconditionally doing the assignment would be fine performance-wise.
> Source/WebCore/platform/SharedBuffer.cpp:253 > RefPtr<SharedBuffer> clone { adoptRef(*new SharedBuffer) };
It’s peculiar that this makes a Ref and then puts it into a RefPtr. It’s pretty clear that we’ll want to change this function to return a Ref<SharedBuffer> instead of a PassRefPtr<SharedBuffer> later, but you don’t need to do that now.
> Source/WebCore/platform/SharedBuffer.cpp:386 > +RefPtr<SharedBuffer> SharedBuffer::createWithContentsOfFile(const String& filePath) > +{ > + MappedFileData mappedFileData; > + if (!mappedFileData.map(filePath)) > + return nullptr; > + return adoptRef(new SharedBuffer(WTF::move(mappedFileData))); > +}
I’m not sure it’s correct to entirely replace createWithContentsOfFile with a function that always does mmap. There are files that you can open and read successfully but can’t mmap. For example, I believe that files on removable media on OS X behave that way. This function should almost certainly either have a name that indicates it will only do file mapping, or a strategy of doing mapping if it can and falling back to file reading if it cannot.
> Source/WebCore/platform/SharedBuffer.h:96 > - void clear(); > + WEBCORE_EXPORT void clear();
I still don’t understand what changed to make this additional export necessary. I accept your claim that it is needed to make things compile, but I don’t know what triggered that new requirement.
> Source/WebCore/platform/SharedBuffer.h:105 > - PassRefPtr<SharedBuffer> copy() const; > + WEBCORE_EXPORT PassRefPtr<SharedBuffer> copy() const;
Ditto.
> Source/WebCore/platform/SharedBuffer.h:133 > + SharedBuffer(MappedFileData&&);
Should be marked explicit.
Darin Adler
Comment 21
2015-05-06 08:49:53 PDT
Comment on
attachment 252393
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=252393&action=review
>> Source/WebCore/platform/SharedBuffer.h:96 >> + WEBCORE_EXPORT void clear(); > > I still don’t understand what changed to make this additional export necessary. I accept your claim that it is needed to make things compile, but I don’t know what triggered that new requirement.
Oh, wait, I do understand. You added tests and needed WEBCORE_EXPORT to make the function testable. OK.
Michael Catanzaro
Comment 22
2015-05-06 10:18:23 PDT
(In reply to
comment #20
)
> In WebKit coding style we avoid argument names that are letters or letter > clusters and use words instead. Here I would use the word "data" or the word > "other".
FWIW, if I see a variable named fd, I assume it's a file descriptor and would be confused if it's not an int.
youenn fablet
Comment 23
2015-05-06 12:46:07 PDT
(In reply to
comment #20
)
> Comment on
attachment 252393
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=252393&action=review
> > This patch is OK as is, but I still see room for improvement.
Thanks for the review. Given your point on unmappable files, I would tend to further improve that patch with a new review cycle. Let me know if you think a follow-up patch would be a better option.
> I didn’t carefully study all the SharedBuffer functions that this patch > doesn’t touch to make sure that they all handle m_fileData correctly, but > otherwise I looked closely at all the code.
This code follows the same mechanism as platformData (which I guess should be unified at some point with it).
> > Source/WebCore/platform/FileSystem.cpp:125 > > +void MappedFileData::swap(MappedFileData& fd) > > In WebKit coding style we avoid argument names that are letters or letter > clusters and use words instead. Here I would use the word "data" or the word > "other".
Yes, this is a poor name, will fix it.
> This is OK and it’s legal C++, but I slightly prefer an operator= > implementation that clears out the other object rather than swapping. It > makes a class that behaves a bit more logically. There are times when you > want to write a = WTF::move(b) and be sure that b is left empty instead of > possibly having state from a. I would write this: > > m_fileData = std::exchange(other.m_fileData, nullptr); > m_fileSize = std::exchange(other.m_fileSize, 0); > > I’d do the equivalent with construction syntax for the move constructor, and > I would omit the swap function entirely.
OK
> > Source/WebCore/platform/FileSystem.cpp:150 > > +MappedFileData::~MappedFileData() > > +{ > > +#if !PLATFORM(WIN) > > + if (!m_fileData) > > + return; > > + munmap(m_fileData, m_fileSize); > > + m_fileData = nullptr; > > +#endif > > +} > > This should go in FileSystemPOSIX.cpp: The other file system functions that > use these APIs are in there.
This would require to duplicate the same code in FileSystemGtk.cpp.
> > Source/WebCore/platform/FileSystem.cpp:155 > > + int fd = open(fileSystemRepresentation(filePath).data(), O_RDONLY); > > This is missing error handling for the case when fileSystemRepresentation > returns a null string. You can see the proper handling of this, although > it’s in unnecessarily wordy coding style, in FileSystemPOSIX.cpp.
OK, I will fix it.
> > Source/WebCore/platform/FileSystem.h:209 > > + MappedFileData() { } > > We should not write this explicitly. The compiler will generate the same > thing if we simply leave this line out. However we would need this if we > decided to add another constructor (see my comment below).
We already have the move constructor.
> > Source/WebCore/platform/FileSystem.h:211 > > + WEBCORE_EXPORT bool map(const String&); > > I’m not sure this design is the best way to go.
I am not very much satisfied with it either. If it were me, I would go with map/unmap methods.
> There’s a bug in the implementation: If a client calls map a second time on > a mapped file data that is already mapped, then it will leak the old mmap’d > region. If we want to go with this design, then we need to fix the map > function or at the very least require that callers not try to use it that > way and assert that. And from a design elegance point of view, if there is > member function named map then I think we should have another member > function named unmap, like the reset function you had in an earlier patch > that I asked you to remove.
Yes, that would be appealing to me. This would allow reuse of the same object for several mappings. But, from the comments I read, it does not seem the preferred approach.
> Another design would be to make the function that actually does the mapping > be a constructor or a function that returns a MappedFileData rather than a > function that mutates an existing MappedFileData. I prefer this approach; > that way you can’t make the mistake of trying to map something that is > already mapped. I suppose since we need an way to tell an error apart from > an empty file we can either use multiple return values (error indication > plus the MappedFileData), or a bool& out argument from the constructor to > tell the caller whether it failed, or we can have an error state of > MappedFileData, distinct from empty data. This could be distinguished with a > separate boolean or with a special value for m_fileData or m_fileSize as > long as we are careful about what the accessor member functions return in > such a case.
I do not really like to create objects just to return an error code. But I can live with it.
> Finally, I don’t think the purpose of the string argument is clear without > an argument name. We should call it filename or filePath, I think. I am not > happy with the inconsistent terminology: "filename", "file name", > "fileName", "path", "path name", "pathName", "filePath". I think we should > chose one instead of mixing them all.
OK
> > Source/WebCore/platform/SharedBuffer.cpp:95 > > + : m_size(0) > > Can we initialize m_size in the class definition in the header instead of > doing it here, please?
OK. I was planning to modify each SharedBuffer constructor accordingly in a follow-up patch, as it seems strange to have different constructor style.
> > Source/WebCore/platform/SharedBuffer.cpp:96 > > + , m_buffer(adoptRef(new DataBuffer)) > > It seems wasteful that we allocate a buffer that likely won’t be used. Can > we make this get allocated when needed instead? Perhaps that would be inside > the maybeTransferMappedFileData function.
Good point, will have a try at fixing it. This should also be modified I guess for platformData.
> > Source/WebCore/platform/SharedBuffer.cpp:235 > > + if (m_fileData) > > + m_fileData = { }; > > No need for the "if" here. Unconditionally doing the assignment would be > fine performance-wise.
OK
> > Source/WebCore/platform/SharedBuffer.cpp:386 > > +RefPtr<SharedBuffer> SharedBuffer::createWithContentsOfFile(const String& filePath) > > +{ > > + MappedFileData mappedFileData; > > + if (!mappedFileData.map(filePath)) > > + return nullptr; > > + return adoptRef(new SharedBuffer(WTF::move(mappedFileData))); > > +} > > I’m not sure it’s correct to entirely replace createWithContentsOfFile with > a function that always does mmap. There are files that you can open and read > successfully but can’t mmap. For example, I believe that files on removable > media on OS X behave that way. > > This function should almost certainly either have a name that indicates it > will only do file mapping, or a strategy of doing mapping if it can and > falling back to file reading if it cannot.
I see. I am not sure we can know in advance whether map will work or not. Falling back to file reading seems a good approach. If we agree this is the right approach, that patch will change substantially and I guess it will need a new review cycle. If there is no objection, I will prepare a new patch accordingly.
> > Source/WebCore/platform/SharedBuffer.h:133 > > + SharedBuffer(MappedFileData&&); > > Should be marked explicit.
OK
Darin Adler
Comment 24
2015-05-06 16:29:21 PDT
Comment on
attachment 252393
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=252393&action=review
>>> Source/WebCore/platform/FileSystem.h:209 >>> + MappedFileData() { } >> >> We should not write this explicitly. The compiler will generate the same thing if we simply leave this line out. However we would need this if we decided to add another constructor (see my comment below). > > We already have the move constructor.
Special constructors such as the copy constructor and the move constructor do not prevent the empty constructor from being automatically generated. So the move constructor doesn’t have an effect here. But a non-special constructor would.
>>> Source/WebCore/platform/SharedBuffer.cpp:386 >>> +} >> >> I’m not sure it’s correct to entirely replace createWithContentsOfFile with a function that always does mmap. There are files that you can open and read successfully but can’t mmap. For example, I believe that files on removable media on OS X behave that way. >> >> This function should almost certainly either have a name that indicates it will only do file mapping, or a strategy of doing mapping if it can and falling back to file reading if it cannot. > > I see. > I am not sure we can know in advance whether map will work or not. > Falling back to file reading seems a good approach. > If we agree this is the right approach, that patch will change substantially and I guess it will need a new review cycle. > If there is no objection, I will prepare a new patch accordingly.
Exactly, we can’t know in advance. New patch seems fine.
youenn fablet
Comment 25
2015-05-12 12:10:39 PDT
> > > Source/WebCore/platform/SharedBuffer.cpp:96 > > > + , m_buffer(adoptRef(new DataBuffer)) > > > > It seems wasteful that we allocate a buffer that likely won’t be used. Can > > we make this get allocated when needed instead? Perhaps that would be inside > > the maybeTransferMappedFileData function.
After some checking, it seems to require adding more "if (!m_buffer)" than what I would like in various place. I am not sure it is worth the extra complexity. I guess a better design would be to split more cleanly the different underlying implementations (subclasses?). A first step might be to move the append() functionality outside SharedBuffer.
youenn fablet
Comment 26
2015-05-13 03:21:39 PDT
Created
attachment 253029
[details]
Constructor with boolean
youenn fablet
Comment 27
2015-05-13 05:30:58 PDT
Created
attachment 253031
[details]
Style improvements
youenn fablet
Comment 28
2015-05-13 07:54:36 PDT
Comment on
attachment 253031
[details]
Style improvements View in context:
https://bugs.webkit.org/attachment.cgi?id=253031&action=review
> Source/WebCore/platform/FileSystem.cpp:128 > + if (!m_fileData)
Kept map/unmap implementation here so as to support GTK, EFL and Mac at the same time. If we want FileSystem.cpp to remain platform free, we may need another FileSystemXX.cpp file.
> Source/WebCore/platform/FileSystem.cpp:131 > + m_fileData = nullptr;
Plan to remove that line.
> Source/WebCore/platform/FileSystem.cpp:135 > +MappedFileData::MappedFileData(const String& filePath, bool& success)
success value could have been an enumeration value to detail the error type. In particular, if file does not exist, no need to try the fallback in SharedBuffer::create... I kept the boolean approach as it is simpler and seems good enough.
> Source/WebCore/platform/SharedBuffer.cpp:88 > + : m_buffer(adoptRef(new DataBuffer))
Not removed as fixing this would add more complexity than I would like
> Source/WebCore/platform/SharedBuffer.cpp:107 > + return SharedBuffer::create();
These two lines do not add much so I plan to remove them.
Darin Adler
Comment 29
2015-05-13 12:17:12 PDT
Comment on
attachment 253031
[details]
Style improvements View in context:
https://bugs.webkit.org/attachment.cgi?id=253031&action=review
Dan Bernstein pointed out that mapping a file is different from reading the contents of the file in one other sense; if the file is changed we will see the new contents rather than the original.
>> Source/WebCore/platform/FileSystem.cpp:131 >> + m_fileData = nullptr; > > Plan to remove that line.
I don’t understand what you mean. When will you remove that line? Some subsequent patch? Before landing this?
> Source/WebCore/platform/FileSystem.cpp:140 > + return;
Please remove this line.
> Source/WebCore/platform/FileSystem.h:212 > + WEBCORE_EXPORT MappedFileData(MappedFileData&&);
This should not have WEBCORE_EXPORT on it because it’s defined in this header.
> Source/WebCore/platform/FileSystem.h:229 > +inline MappedFileData::MappedFileData(MappedFileData&& other) > +{ > + m_fileData = std::exchange(other.m_fileData, nullptr); > + m_fileSize = std::exchange(other.m_fileSize, 0); > +}
Should use construction syntax instead of assignment syntax. : m_fileData(std::exchange(other.m_fileData, nullptr)) , m_fileSize(std::exchange(other.m_fileSize, 0))
>> Source/WebCore/platform/SharedBuffer.cpp:107 >> + if (!mappedFileData) >> + return SharedBuffer::create(); > > These two lines do not add much so I plan to remove them.
I agree, you should.
youenn fablet
Comment 30
2015-05-14 00:00:57 PDT
Created
attachment 253102
[details]
Patch for landing
WebKit Commit Bot
Comment 31
2015-05-14 00:59:21 PDT
Comment on
attachment 253102
[details]
Patch for landing Clearing flags on attachment: 253102 Committed
r184331
: <
http://trac.webkit.org/changeset/184331
>
WebKit Commit Bot
Comment 32
2015-05-14 00:59:27 PDT
All reviewed patches have been landed. Closing bug.
Brady Eidson
Comment 33
2015-05-15 10:18:50 PDT
This patch has caused the Network Process running on Mac to assert often. Exploring more now...
Brady Eidson
Comment 34
2015-05-15 10:30:34 PDT
In NetworkResourceLoader, we create an empty SharedBuffer: if (NetworkCache::singleton().isEnabled()) m_bufferedDataForCache = WebCore::SharedBuffer::create(); Later, we try to append to it inside maybeAppendDataArray: unsigned originalSize = size(); for (auto& cfData : data->m_dataArray) append(cfData.get()); ASSERT(size() == originalSize + data->size()); The problem is originalSize is some huge amount, because the ::size() method things it should get the m_fileData size, even though it really shouldn't.
Brady Eidson
Comment 35
2015-05-15 10:59:56 PDT
(In reply to
comment #34
)
> In NetworkResourceLoader, we create an empty SharedBuffer: > if (NetworkCache::singleton().isEnabled()) > m_bufferedDataForCache = WebCore::SharedBuffer::create(); > > Later, we try to append to it inside maybeAppendDataArray: > unsigned originalSize = size(); > for (auto& cfData : data->m_dataArray) > append(cfData.get()); > ASSERT(size() == originalSize + data->size()); > > The problem is originalSize is some huge amount, because the ::size() method > things it should get the m_fileData size, even though it really shouldn't.
After debugging this a bit, I lost my WebKit disk cache, and it stopped reproducing. That's very unfortunate.
youenn fablet
Comment 36
2015-05-15 12:18:46 PDT
(In reply to
comment #34
)
> In NetworkResourceLoader, we create an empty SharedBuffer: > if (NetworkCache::singleton().isEnabled()) > m_bufferedDataForCache = WebCore::SharedBuffer::create(); > > Later, we try to append to it inside maybeAppendDataArray: > unsigned originalSize = size(); > for (auto& cfData : data->m_dataArray) > append(cfData.get()); > ASSERT(size() == originalSize + data->size()); > > The problem is originalSize is some huge amount, because the ::size() method > things it should get the m_fileData size, even though it really shouldn't.
That seems weird. I wonder whether that is a mac specific issue or if it also happens in other gtk/efl SharedBuffer specific parts. Is there a place or some tests to look for those assertion failures in particular?
Darin Adler
Comment 37
2015-05-19 08:36:47 PDT
Brady, it’s possible your problem was due to that bug in Xcode recompilation where it gets offsets incorrect even though header files has changed. If you ever see it after a clean build that would disprove my theory.
Darin Adler
Comment 38
2015-05-19 08:37:29 PDT
I’m going to look over the code and try to make other guesses about why an empty shared buffer would have a m_fileData that’s not empty.
Brady Eidson
Comment 39
2015-05-19 08:42:17 PDT
(In reply to
comment #38
)
> I’m going to look over the code and try to make other guesses about why an > empty shared buffer would have a m_fileData that’s not empty.
I did the same and did not spot anything :( (In reply to
comment #37
)
> Brady, it’s possible your problem was due to that bug in Xcode recompilation > where it gets offsets incorrect even though header files has changed. If you > ever see it after a clean build that would disprove my theory.
You might be correct. When I nuked my disk cache (unwisely not saving it aside...) I may have also done enough of a rebuild to make it go away. I'm not sure there's anything emergent to explore here, but I will definitely chime in if I see it again.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug