Bug 58762 - localStorage string values truncated at \x00 characters on browser restart
Summary: localStorage string values truncated at \x00 characters on browser restart
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM (show other bugs)
Version: 528+ (Nightly build)
Hardware: Macintosh Intel OS X 10.6
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-04-17 22:58 PDT by Joshua Bell
Modified: 2012-02-10 10:18 PST (History)
8 users (show)

See Also:


Attachments
patch (2.51 KB, patch)
2011-06-14 08:42 PDT, Jonathan Dong
no flags Details | Formatted Diff | Diff
patch (8.09 KB, patch)
2011-07-05 18:51 PDT, Jonathan Dong
jorlow: review-
Details | Formatted Diff | Diff
patch (8.46 KB, patch)
2011-07-11 09:55 PDT, Jonathan Dong
ap: commit-queue-
Details | Formatted Diff | Diff
patch (8.45 KB, patch)
2011-07-11 17:41 PDT, Jonathan Dong
jorlow: review-
jorlow: commit-queue-
Details | Formatted Diff | Diff
patch (8.71 KB, patch)
2011-07-13 10:04 PDT, Jonathan Dong
no flags Details | Formatted Diff | Diff
patch (8.74 KB, patch)
2011-07-14 08:34 PDT, Jonathan Dong
webkit.review.bot: commit-queue-
Details | Formatted Diff | Diff
patch (8.74 KB, patch)
2011-07-14 20:41 PDT, Jonathan Dong
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Joshua Bell 2011-04-17 22:58:17 PDT
Repro steps:

1. Launch WebKit
2. Browse to any site (e.g. https://bugs.webkit.org
3. Develop > Show Error Console
4. Enter: localStorage.setItem('foo','123\x00567')
5. Enter: localStorage.getItem('foo').length

Expected: 7
Actual: 7

6. Quit WebKit
7. Launch WebKit
8. Browse back to the site
9. Develop > Show Error Console
10. Enter: localStorage.getItem('foo').length

Expected: 7
Actual: 3
Comment 1 Alexey Proskuryakov 2011-04-18 10:46:46 PDT
Whoa!
Comment 2 Jonathan Dong 2011-06-11 01:02:27 PDT
I studied the behavior in latest version of Opera and Firefox, in Opera it just truncated the string from \x00 and only '123' left, while in Firefox it converted the \x00 into "\0" then the value became "123\0567".
I also checked the spec of Web Storage (http://dev.w3.org/html5/webstorage/), it implies that "Values can be any data type supported by the structured clone algorithm." But actually, it seems the main stream browsers just treated the value as String.
So I suppose what we should do now is just make sure that the string saved in StorageMap equals to the string we saved into the SQLite database. Just follow either Opera way or Firefox way.

Any opinion about that?
Comment 3 Jeremy Orlow 2011-06-13 21:27:36 PDT
(In reply to comment #2)
> I studied the behavior in latest version of Opera and Firefox, in Opera it just truncated the string from \x00 and only '123' left, while in Firefox it converted the \x00 into "\0" then the value became "123\0567".
> I also checked the spec of Web Storage (http://dev.w3.org/html5/webstorage/), it implies that "Values can be any data type supported by the structured clone algorithm." But actually, it seems the main stream browsers just treated the value as String.
> So I suppose what we should do now is just make sure that the string saved in StorageMap equals to the string we saved into the SQLite database. Just follow either Opera way or Firefox way.
> 
> Any opinion about that?

Yeah, at the very least the value should not change when the browser is restarted.  It's likely getting mangled in SQLite, but I'm not entirely sure.  (In IndexedDB, we switched to using a blob type in SQLite to avoid similar problems.)
Comment 4 Jonathan Dong 2011-06-14 08:38:00 PDT
(In reply to comment #3)
> Yeah, at the very least the value should not change when the browser is restarted.  It's likely getting mangled in SQLite, but I'm not entirely sure.  (In IndexedDB, we switched to using a blob type in SQLite to avoid similar problems.)

you are right, I traced the codes, and the string was mangled by SQLite when generate SQL insert statement using sqlite3_bind_text() in SQLiteStatement::bindText(), which was called in StorageAreaSync::sync() as the value string was treated as a string.

So I suppose the best way to handle this bug is to use BLOB instead of TEXT as the value type in ItemTable.
Comment 5 Jonathan Dong 2011-06-14 08:42:00 PDT
Created attachment 97124 [details]
patch

changed the structure of ItemTable, use BLOB instead of TEXT as the value type, and modified the way we get and set value string.
Comment 6 Jeremy Orlow 2011-06-14 09:07:06 PDT
Comment on attachment 97124 [details]
patch

Will this work for existing databases?  My guess is that you'll need to do some sort of migration.  (i.e. check when loading stuff whether it's text, and if so, convert it.)

Also, you'll need a layout test.  Also I believe the patch is backwards (i.e. I see the change log entry being deleted).
Comment 7 Alexey Proskuryakov 2011-06-14 09:13:51 PDT
Do we have a way to test something like this?
Comment 8 Jeremy Orlow 2011-06-14 09:20:00 PDT
(In reply to comment #7)
> Do we have a way to test something like this?

One could add something to the layout test controller.  Though that'd be a lot of effort for little gain.  A manual test would do the trick though.
Comment 9 Jeremy Orlow 2011-06-14 09:50:55 PDT
I doubt it'll be useful, but in case it is, here's the related indexedDB bug: https://bugs.webkit.org/show_bug.cgi?id=52890

Since IndexedDB doesn't "cache" things in memory, the layout test was trivial.
Comment 10 Jonathan Dong 2011-07-05 18:51:08 PDT
Created attachment 99775 [details]
patch

Thanks for all your suggestions. Actually I'm trying to contribute my first patch in webkit. :)

I've created a new patch including the ItemTable migration and a layout test.
Comment 11 WebKit Review Bot 2011-07-05 18:54:50 PDT
Attachment 99775 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1

Source/WebCore/ChangeLog:11:  Line contains tab character.  [whitespace/tab] [5]
Source/WebCore/ChangeLog:14:  Line contains tab character.  [whitespace/tab] [5]
Source/WebCore/ChangeLog:17:  Line contains tab character.  [whitespace/tab] [5]
Source/WebCore/ChangeLog:18:  Line contains tab character.  [whitespace/tab] [5]
Source/WebCore/ChangeLog:20:  Line contains tab character.  [whitespace/tab] [5]
Source/WebCore/ChangeLog:21:  Line contains tab character.  [whitespace/tab] [5]
Source/WebCore/ChangeLog:23:  Line contains tab character.  [whitespace/tab] [5]
Source/WebCore/ChangeLog:25:  Line contains tab character.  [whitespace/tab] [5]
Source/WebCore/ChangeLog:27:  Line contains tab character.  [whitespace/tab] [5]
Source/WebCore/ChangeLog:28:  Line contains tab character.  [whitespace/tab] [5]
Total errors found: 10 in 6 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 12 Jeremy Orlow 2011-07-06 22:02:27 PDT
Comment on attachment 99775 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=99775&action=review

looking pretty good

> Source/WebCore/ChangeLog:7
> +

Data migrations scare me and people like release engineers and QA...might be worth a bit more of a note with something like DATA MIGRATION in all caps so it can catch the right people's attention if they're skimming the ChangeLog.  :-)

> Source/WebCore/manual-tests/localstorage-value-truncation.html:18
> +        document.write("</p><p>The expected value is: '123\x0056', the expected length is: 6</p>");

You should also be able to write a big pass or fail for whether it seems correct.

> Source/WebCore/storage/StorageAreaSync.cpp:270
> +    SQLiteStatement query(m_database, "SELECT key, value FROM ItemTable");

This will be very heavyweight.  There must be some way to examine just the column.  If not, at least limit it to one row.

> Source/WebCore/storage/StorageAreaSync.cpp:273
> +    query.finalize();

The destructor finalizes, so maybe a bit cleaner to just wrap the block in {}'s so query goes out of scope at this point.

> Source/WebCore/storage/StorageAreaSync.cpp:284
> +    size_t numOfCommands = sizeof(commands) / sizeof(commands[0]);

Does webkit have any macro for this?  I vaguely remember the answer being no, but just double checking.  Alternately, putting a NULL as the last element and looping until you see it might be slightly cleaner.

> Source/WebCore/storage/StorageAreaSync.cpp:286
> +    SQLiteTransaction transaction(m_database, false);

If you're feeling particularly heroic, creating an enum for read only and read write and replacing the second param for SQLiteTransactions would be awesome.  Prob in another patch.  cc me if you decide to do it.

> Source/WebCore/storage/StorageAreaSync.cpp:289
> +        if (!m_database.executeCommand(commands[i])) {

This worries me some...  If this happens, will localStorage not be able to proceed?  I kind of think in such a case we should just drop all the old data (well, save it to some special table name so that it is at least possible to recover it) and move on...

And there should be a NOTREACHED here since someone with it under debugger would probably like to know if we fail.  :-)
Comment 13 Jonathan Dong 2011-07-11 09:55:01 PDT
Created attachment 100315 [details]
patch

Thank you for your help Jeremy, you are so kind and patient, and your suggestions are very useful to me.
I've modified all the problems you've mentioned, please help me to review it.

BTW, I'm interested in creating an enum for the SQLiteTransactions, could you tell me how can we start?
Comment 14 WebKit Review Bot 2011-07-11 09:57:42 PDT
Attachment 100315 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1

Source/WebCore/storage/StorageAreaSync.cpp:302:  One line control clauses should not use braces.  [whitespace/braces] [4]
Total errors found: 1 in 6 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 15 Alexey Proskuryakov 2011-07-11 09:58:27 PDT
Comment on attachment 100315 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=100315&action=review

> Source/WebCore/platform/sql/SQLiteStatement.cpp:298
> +    return String("BLOB") == String(reinterpret_cast<const UChar*>(sqlite3_column_decltype16(m_statement, col))).upper();

Please use equalIgnoringCase function here.

> Source/WebCore/platform/sql/SQLiteStatement.h:77
> +    bool isColumnDeclearedAsBlob(int col);

s/Decleared/Declared/.
Comment 16 Jonathan Dong 2011-07-11 17:41:12 PDT
Created attachment 100391 [details]
patch

Thanks Alexey, I've fixed issues that you mentioned.
Comment 17 Jeremy Orlow 2011-07-13 01:32:52 PDT
(In reply to comment #13)
> Created an attachment (id=100315) [details]
> patch
> 
> Thank you for your help Jeremy, you are so kind and patient, and your suggestions are very useful to me.
> I've modified all the problems you've mentioned, please help me to review it.
> 
> BTW, I'm interested in creating an enum for the SQLiteTransactions, could you tell me how can we start?

Simply create it in that class, change the bool param in the method to use your enum, search for all places that call the method, and change them to use the enum instead of the boolean.
Comment 18 Jeremy Orlow 2011-07-13 01:43:54 PDT
Comment on attachment 100391 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=100391&action=review

Should be r+able if this stuff is addressed.  Thanks!

> Source/WebCore/storage/StorageAreaSync.cpp:276
> +        SQLiteStatement query(m_database, "SELECT value FROM ItemTable where rowid = 1");

Hm...will this always work?  What if someone inserts something, inserts another something, then deletes that first something.  Won't then there be items in it but no rowid = 1?  I think you just want s/where rowid = 1/limit 1/.  Right?

> Source/WebCore/storage/StorageAreaSync.cpp:277
> +        if (query.isColumnDeclaredAsBlob(0))

Wait, will this work if there was no data in the table?

> Source/WebCore/storage/StorageAreaSync.cpp:294
> +    while (commands[i]) {

Probably a bit cleaner to do for (size_t i = 0; commands[i]; ++i)

> Source/WebCore/storage/StorageAreaSync.cpp:299
> +            // save the ItemTable for future restore by renaming it.

Mention that this will essentially delete the current database, but that's better than continually hitting this case and never being able to use the local storage.  Mention that if this is ever hit, it's definitely a bug, and that's why we assert not reached below.

Actually, you should probably move the assert not reached to just above this line.
Comment 19 Jonathan Dong 2011-07-13 02:30:01 PDT
Comment on attachment 100391 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=100391&action=review

>> Source/WebCore/storage/StorageAreaSync.cpp:276
>> +        SQLiteStatement query(m_database, "SELECT value FROM ItemTable where rowid = 1");
> 
> Hm...will this always work?  What if someone inserts something, inserts another something, then deletes that first something.  Won't then there be items in it but no rowid = 1?  I think you just want s/where rowid = 1/limit 1/.  Right?

yes your way is more appropriate, thanks.

>> Source/WebCore/storage/StorageAreaSync.cpp:277
>> +        if (query.isColumnDeclaredAsBlob(0))
> 
> Wait, will this work if there was no data in the table?

I've tested on the empty ItemTable and it works. Actually we just prepare() but not really step() the statement in isColumnDeclaredAsBlob() function, as the sqlite3_column_decltype16() do not need to execute the statement.
Comment 20 Jonathan Dong 2011-07-13 10:04:17 PDT
Created attachment 100681 [details]
patch

Thanks Jeremy, I've learned a lot from this patch.
Comment 21 Jeremy Orlow 2011-07-13 13:40:59 PDT
Comment on attachment 100681 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=100681&action=review

A few suggestions to make things more clear.

> Source/WebCore/storage/StorageAreaSync.cpp:276
> +        SQLiteStatement query(m_database, "SELECT value FROM ItemTable LIMIT 1");

Add a comment to note that this query isn't ever executed.

> Source/WebCore/storage/StorageAreaSync.cpp:298
> +            ASSERT_NOT_REACHED();

Maybe put this between the comment and the if statement since the comment partially explains this too.

> Source/WebCore/storage/StorageAreaSync.cpp:300
> +            // finally it will try to save the ItemTable by renaming it, for manually restore in the future.

nit: It's nice to puse proper sentences.

> Source/WebCore/storage/StorageAreaSync.cpp:301
> +            // NOTICE: this will eventually DELETE the current database, but that's better

Did you mean "essentially"?
Comment 22 Jonathan Dong 2011-07-14 08:34:37 PDT
Created attachment 100810 [details]
patch

Thanks a lot, Jeremy.
Comment 23 WebKit Review Bot 2011-07-14 13:01:32 PDT
Comment on attachment 100810 [details]
patch

Rejecting attachment 100810 [details] from commit-queue.

Failed to run "['./Tools/Scripts/webkit-patch', '--status-host=queues.webkit.org', '--bot-id=ec2-cq-02', '--port..." exit_code: 1

Last 500 characters of output:
hangeLog does not appear to be a valid reviewer according to committers.py.
ERROR: /mnt/git/webkit-commit-queue/Source/WebCore/ChangeLog neither lists a valid reviewer nor contains the string "Unreviewed" or "Rubber stamp" (case insensitive).
Updating OpenSource
Current branch master is up to date.
Updating chromium port dependencies using gclient...

________ running '/usr/bin/python gyp_webkit' in '/mnt/git/webkit-commit-queue/Source/WebKit/chromium'
Updating webkit projects from gyp files...

Full output: http://queues.webkit.org/results/9061334
Comment 24 Jonathan Dong 2011-07-14 20:41:15 PDT
Created attachment 100923 [details]
patch

Sorry Jeremy, I thought commit bot will add your name into ChangeLog as the reviewer but it seems I have to add it manually. thank you for your patient! :)
Comment 25 Jeremy Orlow 2011-07-15 00:41:39 PDT
Comment on attachment 100923 [details]
patch

rubber stamp
Comment 26 WebKit Review Bot 2011-07-15 01:35:29 PDT
Comment on attachment 100923 [details]
patch

Clearing flags on attachment: 100923

Committed r91060: <http://trac.webkit.org/changeset/91060>
Comment 27 Dongwoo Joshua Im 2011-09-08 02:53:14 PDT
I
Comment 28 Dongwoo Joshua Im 2011-09-08 02:56:21 PDT
I have one question. Why do we need to convert ItemTable to ItemTable2, rather than just create the ItemTable use BLOB value for the first time?
Comment 29 Naveen Bobbili 2011-09-23 06:07:38 PDT
As per sqlite3 API , NULL pointer is returned if we try to get the value for an empty string ("") stored as blob. So storing an empty string as would return NULL string while retrieving.
Test Case:
<html> 
  <head> 
    <script> 
      console.log(localStorage['test1'] + "|"+localStorage['test2']+"|");
      localStorage['test1'] = "hi";
      localStorage['test2'] = "";
    </script> 
  </head> 
  <body>
  </body>
</html>

2. Load in chrome  and on first visit it reads "undefined|undefined" in the console. Reload and you get "hi||".
3. Restart browser and it prints "hi|undefined". But it is supposed to print "hi||" on restarting the browser.

This issue has creeped in once we started storing the value as BLOB and was working well on previous versions of chrome.

For reference: http://code.google.com/p/chromium/issues/detail?id=90879
We can fix it in sqlite3 api to return the valid pointer instead of a NULL pointer when the length of the string is 0. 

But seemed logical to start a discussion here. Please post your replies.
Comment 30 Naveen Bobbili 2011-09-29 06:37:18 PDT
I checked from my end to make an attempt to checkin code changes to Sqlite3 api code in chrome. It is not approved. 
So we might have to make changes in LocalStorage code.
There are 2 approaches that I could think of

1) Since local storage doesn't allow NULL blobs so we surely know that if we get a NULL pointer for sqlite3_column_blob() API it is because we have stored an empty string. Keeping this in perspective we can check if blobisempty and return TRUE when sqlite3_column_blob() return NULL . But this patch has some evident problems. The check would return TRUE for caller who allow storing NULL blobs also. So its not consistent.

2) Add a new column inside local storage TABLE which sets an INTEGER value to 1 when if we are storing EMPTY string and 0 other wise. While retrieving the value from db we can use this column's value. 

Please let me know which is the preferred solution, so that I can put up a patch for the same.
Comment 31 Naveen Bobbili 2011-09-30 01:19:06 PDT
https://bugs.webkit.org/show_bug.cgi?id=69138 
Added to track changes to fix empty string issue.
Comment 32 Eric Seidel (no email) 2012-02-10 10:18:16 PST
Comment on attachment 100681 [details]
patch

Cleared Jeremy Orlow's review+ from obsolete attachment 100681 [details] so that this bug does not appear in http://webkit.org/pending-commit.