WebKit Bugzilla
RESOLVED FIXED
18183
Crashes when saving webpage to Web Archive format .webarchive file
https://bugs.webkit.org/show_bug.cgi?id=18183
nobody
Reported
2008-03-28 12:08:49 PDT
Crashes when saving a webpage to Web Archive format (.webarchive file). First noticed in r31370; problem remains in r31388.
Attachments
Bug 18183- crash log (63.22 KB, text/plain), 2008-03-28 18:34 PDT, nobody
Bug 18183- crash log #2 (70.65 KB, text/plain), 2008-03-28 20:09 PDT, nobody
crash log #3 (33.38 KB, text/plain), 2008-03-28 21:57 PDT, nobody
bug 18183- 10.5.2 Safe Boot crash log (34.29 KB, text/plain), 2008-03-28 22:10 PDT, nobody
Bug 18183- no unsanity.txt (58.34 KB, text/plain), 2008-03-28 23:12 PDT, nobody
Bug 18183- no default folder.txt (56.10 KB, text/plain), 2008-03-28 23:14 PDT, nobody
another crash log (50.60 KB, text/plain), 2008-03-28 23:15 PDT, nobody
Bug 18183- screen capture (404.80 KB, image/png), 2008-03-29 07:44 PDT, nobody
hosts file (modified) (490 bytes, text/plain), 2008-03-29 17:55 PDT, nobody
Proposed fix (no layout test...) (1.21 KB, patch), 2008-03-30 17:09 PDT, Brady Eidson, mitz: review+
hosts.txt (4.48 KB, text/plain), 2008-04-01 15:48 PDT, nobody
nobody
Comment 1
2008-03-28 14:28:53 PDT
The crash does not occur with every webpage, but when it does occur with a particular webpage, it seems to be consistently repeatable with that page. Example:
http://www.time.com/time/politics/article/0,8599,1725514,00.html
Saving as .html works; saving as .webarchive causes the crash.
Matt Lilek
Comment 2
2008-03-28 16:17:13 PDT
This doesn't crash for me. Can you please attach a crash log -
http://webkit.org/quality/crashlogs.html
Brady Eidson
Comment 3
2008-03-28 17:00:20 PDT
I can't repro either... if you can, a crash log is critical to explore this...
nobody
Comment 4
2008-03-28 18:34:22 PDT
Created attachment 20178
Bug 18183- crash log
nobody
Comment 5
2008-03-28 20:09:09 PDT
Created attachment 20180
Bug 18183- crash log #2
Crash log from trying to save the webpage as a .webarchive file. Here's the webpage:
http://www.time.com/time/politics/article/0,8599,1725514,00.html
Brady Eidson
Comment 6
2008-03-28 21:13:34 PDT
I see some haxies in your crashlog, and all such 3rd party extensions are unsupported. Can you try removing them and then reproducing?
nobody
Comment 7
2008-03-28 21:57:32 PDT
Created attachment 20182
crash log #3
Here is the crash log for the same error, this time under 10.5.2, same webpage.
nobody
Comment 8
2008-03-28 22:10:03 PDT
Created attachment 20183
bug 18183- 10.5.2 Safe Boot crash log
Crash when saving the same webpage, this time under 10.5.2 with Safe Boot (startup holding the Shift key). (I don't know if this is what you need. Hope this helps... thanks)
nobody
Comment 9
2008-03-28 22:35:26 PDT
(In reply to comment #6)
> I see some haxies in your crashlog, and all such 3rd party extensions are
> unsupported. Can you try removing them and then reproducing?
Uh-oh. I did not consciously install any haxies... I don't know where they came from or how to remove them. Is this what I am trying to remove?
com.unsanity.smartcrashreports
com.unsanity.menuextraenabler 1.0.3
Yuck, I feel like my computer has been infected.
nobody
Comment 10
2008-03-28 23:12:54 PDT
Created attachment 20186
Bug 18183- no unsanity.txt
Removed Unsanity software.
nobody
Comment 11
2008-03-28 23:14:01 PDT
Created attachment 20187
Bug 18183- no default folder.txt
Removed Default Folder X.
nobody
Comment 12
2008-03-28 23:15:02 PDT
Created attachment 20188
another crash log
Another crash log. This bug is very repeatable for me.
nobody
Comment 13
2008-03-28 23:41:01 PDT
The crash occurs when saving webpages that contain items that could not be loaded. On my computer, some domains are filtered out (in my case, these domain names point to localhost rather than their correct host IP address). If the domains are unfiltered and allowed to resolve naturally, so that all items are loaded, then the page can be saved to .webarchive without crashing. I have not tested what happens if the domain name or IP address is merely blocked, or if an item fails to load for any other reason. So far this consistently explains which pages save correctly and which pages crash when saved as a .webarchive file. I hope this helps to pinpoint the nature of the bug...
nobody
Comment 14
2008-03-29 01:11:12 PDT
For example:
http://www.time.com/time/politics/article/0,8599,1725514,00.html
Looking at the Activity window, we see that the page loads items from other domains, such as ad.doubleclick.net, ad.insightexpressai.com, an.tacoda.net, ar.atwola.com, bin.clearspring.com, cdn1.sphere.com, and so on.
nobody
Comment 15
2008-03-29 07:44:25 PDT
Created attachment 20191
Bug 18183- screen capture
Screen capture of the Activity window showing items that are not loaded because of "cannot connect to host" for certain domains. This behavior is intentional and expected, given the user's custom configuration. However, this condition seems to be the cause of the crash when attempting to save the webpage as a .webarchive file.
nobody
Comment 16
2008-03-29 17:55:56 PDT
Created attachment 20202
hosts file (modified)
To block unwanted domains from loading, modify the hosts file as shown in the attachment. The hosts file is located at /private/etc/hosts on your Mac. Note that the filename is "hosts" and not "hosts.txt". Add a line to the hosts file such as "127.0.0.1 www.someunwanteddomainnamehere.com"; you can add as many such lines as you wish. This blocks the unwanted domain by resolving it to localhost instead of looking up the domain name in DNS. HTH
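As a rough illustration of the blocking mechanism described in this comment (not code from WebKit or macOS), the Python sketch below parses hosts-style text and checks whether a name has been pointed at the loopback address. The function names `parse_hosts` and `resolves_to_loopback` are hypothetical, and the sample entries reuse the placeholder domain from the comment rather than the contents of the attached file.

```python
# Hypothetical sketch: how hosts-style entries redirect a name to localhost.
# The sample text is illustrative; it is not the attached hosts file.

HOSTS_TEXT = """\
# comment lines start with '#'
127.0.0.1 localhost
127.0.0.1 www.someunwanteddomainnamehere.com
"""


def parse_hosts(text):
    """Return a {hostname: ip} mapping from hosts-file text."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        ip, *names = line.split()  # first field is the IP, the rest are names
        for name in names:
            table[name.lower()] = ip
    return table


def resolves_to_loopback(hostname, table):
    """True if the hosts table maps this name to 127.0.0.1."""
    return table.get(hostname.lower()) == "127.0.0.1"


table = parse_hosts(HOSTS_TEXT)
print(resolves_to_loopback("www.someunwanteddomainnamehere.com", table))
print(resolves_to_loopback("www.time.com", table))
```

A name that is absent from the table would fall through to normal DNS lookup, which matches the "unfiltered" case described in comment 13.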
Matt Lilek
Comment 17
2008-03-29 18:30:18 PDT
Modifying my hosts file still doesn't reproduce this crash - I wonder if this is Tiger-only?
nobody
Comment 18
2008-03-30 00:30:57 PDT
(In reply to comment #17)
> Modifying my hosts file still doesn't reproduce this crash - I wonder if this
> is Tiger-only?
I wondered about that, but Leopard crashes the same as Tiger (for me). Leopard crash log:
http://bugs.webkit.org/attachment.cgi?id=20183
(posted previously)
nobody
Comment 19
2008-03-30 10:05:52 PDT
(In reply to comment #13)
> I have not tested what happens if the domain name or IP address is merely
> blocked, or if any item fails to load for any other reason.
I tried saving some web pages that contained items that were not loaded (because they were firewalled normally), and those pages saved OK. I therefore posted the hosts file info in case it is helpful, thinking that the hosts blocking technique might be related to the crashing.
Matt Lilek
Comment 20
2008-03-30 10:37:08 PDT
What other web pages does this crash on for you? Does it happen on something as simple as Google?
Matt Lilek
Comment 21
2008-03-30 10:39:28 PDT
(In reply to comment #20)
> What other web pages does this crash on for you? Does it happen on something
> as simple as Google?
Also, when you say some things were "firewalled normally", what exactly do you mean? Are you going through a proxy that blocks certain things (like ads), or how extensive is your hosts file? Does this still happen when you have a straight-through, unfiltered connection to these sites?
nobody
Comment 22
2008-03-30 12:29:32 PDT
(In reply to comment #20)
> What other web pages does this crash on for you? Does it happen on something
> as simple as Google?
No, Google saves OK. The Acid 3 page saves OK, also. Drudge Report (a simple page) with no blocking saves OK, but Drudge Report with the adgardener.com domains blocked in hosts crashes. The www.time.com page above behaves the same way. So far, every page without hosts blocking saves OK. Every page that has crashed had items from domains that were blocked in hosts. I found a page that has hosts blocking but saves OK, though:
http://www.cnn.com/2008/US/03/30/dith.pran.obit.ap/index.html
With servedby.advertising.com and view.atmdt.com blocked by hosts, it saves OK. These blocked items are shown in Activity folded under outline triangles; I don't know if that means anything or makes a difference.
nobody
Comment 23
2008-03-30 12:45:00 PDT
(In reply to comment #21)
If some item(s) is not loaded (can't connect, or whatever) due to "natural causes", the page saves OK.
If some item(s) is not loaded due to hosts blocking, saving the page to .webarchive format crashes (almost always).
If some item(s) is not loaded due to firewall blocking (tried Little Snitch to block specific domains), the page saves OK.
My hosts file lists approx. 50 domains to block, but the results were the same when I tested with just one domain, in an attempt to find a case that you and others might be able to repeat.
Brady Eidson
Comment 24
2008-03-30 17:06:58 PDT
Sorry I haven't responded to this one in a few days. While I still can't reproduce under any circumstances, I finally got a chance to look at the code - and it's a simple null dereference.
Brady Eidson
Comment 25
2008-03-30 17:09:42 PDT
Created attachment 20227
Proposed fix (no layout test...)
Attached the obvious fix - but since I can't repro, I don't know how to make a layout test for this...
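The patch itself is not reproduced in this thread, so the following is only a hedged sketch of the general kind of guard that a "simple null dereference" fix usually amounts to. It is written in Python rather than WebKit's C++, and everything in it is an assumption made for illustration: the function `build_archive`, the dictionary-shaped resources, and the idea that a subresource which failed to load shows up as a missing (None) entry. The thread does not state where the null pointer actually occurred.

```python
# Illustrative only: not WebKit code and not the attached patch.
# Assumption: a subresource that failed to load is represented as None.

def build_archive(main_resource, subresources):
    """Collect the URLs to archive, skipping entries that never loaded."""
    archived = [main_resource["url"]]
    for resource in subresources:
        if resource is None:
            # Without this check, resource["url"] would raise a TypeError,
            # the Python analogue of dereferencing a null pointer.
            continue
        archived.append(resource["url"])
    return archived


page = {"url": "http://www.example.com/article.html"}
subs = [{"url": "http://www.example.com/style.css"}, None]
print(build_archive(page, subs))
```

The design choice in this sketch is to skip the missing entry rather than abort the whole save, which is consistent with the reporter's later observation that pages with blocked items save without crashing after the fix.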
nobody
Comment 26
2008-03-31 07:20:52 PDT
(In reply to comment #25)
> Attached the obvious fix - but since I can't repro, I don't know how to make a
> layout test for this...
Thanks! Is it possible to download a build containing this fix? I would be happy to give it a try.
Matt Lilek
Comment 27
2008-03-31 07:25:56 PDT
(In reply to comment #26)
> (In reply to comment #25)
> > Attached the obvious fix - but since I can't repro, I don't know how to make a
> > layout test for this...
>
> thanks!
>
> is it possible to download a build containing this fix? i would be happy to
> give it a try.
Not yet - the patch should be reviewed and landed today and will hopefully appear in a nightly soon.
mitz
Comment 28
2008-03-31 10:02:41 PDT
Comment on attachment 20227
Proposed fix (no layout test...)
r=me
Brady Eidson
Comment 29
2008-03-31 10:07:20 PDT
Landed in r31467
nobody
Comment 30
2008-04-01 15:44:33 PDT
Tested r31535. The crashing problem is gone (as expected). Thanks!
But blocked items are not appearing in the Activity window (only a small percentage of blocked items are shown). Also, the Status bar is underreporting errors, because blocked items that do not appear in the Activity window are not reported as errors. A few examples:
On www.drudgereport.com: everything is blocked other than drudgereport.com and d.yimg.com. harvest.adgardener.com items appear in Activity correctly ("can't connect to host") and are reported correctly as errors in the Status bar. Other blocked domains do not appear in Activity and are not counted as errors.
On www.cnn.com: almost everything is blocked; www.cnn.com, i.cdn.turner.com, and i.l.cnn.net are not blocked. metrics.cnn.com is blocked; it is the only blocked domain that appears in the Activity window and is counted as an error.
On http://www.time.com/time/politics/article/0,8599,1725514,00.html: everything is blocked except www.time.com and img.timeinc.net. No blocked items appear in the Activity window, and no errors are reported.
nobody
Comment 31
2008-04-01 15:48:53 PDT
Created attachment 20274
hosts.txt
Here is a sample hosts file containing the domains blocked for the examples described in comment #30 above.
Brady Eidson
Comment 32
2008-04-01 17:14:14 PDT
You are now describing a completely different issue. If you could, please write up a new bug with the new issue. Also, please make sure you compare the behavior of shipping Safari 3.1 to the latest nightly to see if they differ. Thanks!
nobody
Comment 33
2008-04-01 17:20:47 PDT
(In reply to comment #32)
> If you could, please
> write up a new bug with the new issue.
OK, will do... Thanks!
(In reply to comment #32)
> Also, please make sure you compare the behavior of shipping Safari 3.1 to the
> latest nightly to see if they differ.
Yes, Safari 3.1 reports the Activity ("can't connect to host") correctly.
nobody
Comment 34
2008-04-01 18:11:35 PDT
> write up a new bug with the new issue.
Added as bug #18267.