Crashes when saving a webpage in Web Archive format (.webarchive file). First noticed in r31370; the problem remains in r31388.
The crash does not occur when saving every webpage, but if it does occur when saving a particular webpage, it seems to be consistently repeatable with that webpage. Example: http://www.time.com/time/politics/article/0,8599,1725514,00.html (saving as .html works; saving as .webarchive causes the crash).
This doesn't crash for me. Can you please attach a crash log - http://webkit.org/quality/crashlogs.html
I can't repro either... if you can, a crash log is critical to explore this...
Created attachment 20178 [details] Bug 18183- crash log
Created attachment 20180 [details] Bug 18183- crash log #2
Crash log from trying to save the webpage as a .webarchive file. Here's the webpage: http://www.time.com/time/politics/article/0,8599,1725514,00.html
I see some haxies in your crashlog, and all such 3rd party extensions are unsupported. Can you try removing them and then reproducing?
Created attachment 20182 [details] crash log #3
Here is a crash log for the same error with the same webpage, this time under 10.5.2.
Created attachment 20183 [details] bug 18183- 10.5.2 Safe Boot crash log
Crash when saving the same webpage, this time under 10.5.2 with Safe Boot (startup holding the "shift" key). (I don't know if this is what you need. Hope this helps... thanks)
(In reply to comment #6)
> I see some haxies in your crashlog, and all such 3rd party extensions are
> unsupported. Can you try removing them and then reproducing?

Uh-oh. I did not consciously install any haxies... I don't know where they came from or how to remove them. Is this what I am trying to remove?
com.unsanity.smartcrashreports
com.unsanity.menuextraenabler 1.0.3
Yuck, I feel like my computer has been infected.
Created attachment 20186 [details] Bug 18183- no unsanity.txt removed unsanity software
Created attachment 20187 [details] Bug 18183- no default folder.txt removed Default Folder X
Created attachment 20188 [details] another crash log
This bug is very repeatable for me.
The crash occurs when saving webpages that contain items that could not be loaded. On my computer, some domains are filtered out (in my case, these domain names point to localhost rather than their correct host IP address). If the domains are unfiltered and allowed to resolve normally, so that all items load, the page can be saved to .webarchive without crashing. I have not tested what happens if the domain name or IP address is merely blocked, or if any item fails to load for any other reason. So far, this consistently explains which pages save correctly and which pages crash when saved as a .webarchive file. I hope this helps to pinpoint the nature of the bug...
For example, at http://www.time.com/time/politics/article/0,8599,1725514,00.html the Activity window shows that the page loads items from other domains, such as:
ad.doubleclick.net
ad.insightexpressai.com
an.tacoda.net
ar.atwola.com
bin.clearspring.com
cdn1.sphere.com
and so on.
Created attachment 20191 [details] Bug 18183- screen capture
The screen capture of the Activity window shows items that are not loaded because of "cannot connect to host" for certain domains. This behavior is intentional and expected, given the user's custom configuration. However, this condition seems to be the cause of the crash when attempting to save the webpage as a .webarchive file.
Created attachment 20202 [details] hosts file (modified)
To block unwanted domains from loading, modify the hosts file as shown in the attachment. The hosts file is located at /private/etc/hosts on your Mac. Note that the filename is "hosts" and not "hosts.txt". Add a line to the hosts file such as:
127.0.0.1 www.someunwanteddomainnamehere.com
You can add as many such lines as you wish. This blocks the unwanted domain by resolving it to localhost instead of looking up the domain name in DNS. HTH
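For anyone trying to reproduce this with several domains, the hosts-file lines described above can be generated rather than typed by hand. The following is a minimal Python sketch; the helper name `hosts_block_lines` is made up for illustration, and it only formats text (it never reads or writes /private/etc/hosts).

```python
def hosts_block_lines(domains, address="127.0.0.1"):
    """Return one hosts-file line per domain, mapping it to `address`.

    Hypothetical helper for illustration only: it formats text and never
    touches the real hosts file.
    """
    lines = []
    seen = set()
    for domain in domains:
        name = domain.strip().lower()
        # Skip blank entries and duplicates so each domain is listed once.
        if not name or name in seen:
            continue
        seen.add(name)
        lines.append(f"{address} {name}")
    return lines
```

Calling `hosts_block_lines(["www.someunwanteddomainnamehere.com"])` yields the single line quoted in the comment above; the resulting lines would still have to be appended to the hosts file manually, with administrator rights.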
Modifying my hosts file still doesn't reproduce this crash - I wonder if this is Tiger-only?
(In reply to comment #17)
> Modifying my hosts file still doesn't reproduce this crash - I wonder if this
> is Tiger-only?

I wondered about it, but Leopard crashes the same as Tiger (for me).
Leopard crash log: http://bugs.webkit.org/attachment.cgi?id=20183 (posted previously)
(In reply to comment #13)
> I have not tested what happens if the domain name or IP address is merely
> blocked, or if any item fails to load for any other reason.

I tried saving some web pages that contained items that were not loaded (because they were firewalled normally), and those pages saved OK. Thus, I posted the hosts file info in case that might be helpful, thinking that the hosts blocking technique might be related to the crashing.
What other web pages does this crash on for you? Does it happen on something as simple as Google?
(In reply to comment #20)
> What other web pages does this crash on for you? Does it happen on something
> as simple as Google?

Also, when you say some things were "firewalled normally", what exactly do you mean? Are you going thru a proxy that blocks certain things (like ads), or how extensive is your hosts file? Does this still happen when you have a straight thru, unfiltered connection to these sites?
(In reply to comment #20)
> What other web pages does this crash on for you? Does it happen on something
> as simple as Google?

No, Google saves OK. The Acid 3 page saves OK, also. Drudge Report (a simple page) with no blocking saves OK, but Drudge Report with the adgardener.com domains blocked in hosts crashes. The www.time.com page above behaves the same way.

So far, every page without hosts blocking saves OK. Every page that has crashed had items from domains that were blocked in hosts.

I found a page that has hosts blocking but saves OK, though: http://www.cnn.com/2008/US/03/30/dith.pran.obit.ap/index.html with servedby.advertising.com and view.atmdt.com blocked by hosts saves OK. These blocked items are shown in Activity folded under outline triangles; I don't know if that means anything or makes a difference.
(In reply to comment #21)
If some item(s) is not loaded (can't connect, or whatever) due to "natural causes", the page saves OK.
If some item(s) is not loaded due to hosts blocking, saving the page to .webarchive format crashes (almost always).
If some item(s) is not loaded due to firewall blocking (tried Little Snitch to block specific domains), the page saves OK.

My hosts file lists approx 50 domains to block. But the results were the same when I tested with just one domain, in an attempt to produce a case that you and others might be able to repeat.
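Since the outcome seems to depend on whether a domain is blocked via hosts or by some other means, it may help to check how a given name actually resolves on the test machine. The Python sketch below is one possible diagnostic, not part of the original report; `resolves_to_loopback` is a hypothetical name, and it assumes the system resolver consults the hosts file, as it does on typical configurations.

```python
import ipaddress
import socket


def resolves_to_loopback(hostname):
    """Return True if every address `hostname` resolves to is a loopback address.

    Hypothetical diagnostic helper. Returns False when the name does not
    resolve at all.
    """
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return False
    addresses = set()
    for info in infos:
        # info[4] is the socket address tuple; its first element is the IP string.
        # Drop any IPv6 scope suffix such as "%lo0" before parsing.
        addresses.add(info[4][0].split("%")[0])
    return bool(addresses) and all(
        ipaddress.ip_address(address).is_loopback for address in addresses
    )
```

On a machine where a domain has been pointed at 127.0.0.1 in the hosts file, `resolves_to_loopback` should return True for that domain; a domain blocked only by a firewall would normally still resolve to its real address and return False.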
Sorry I haven't responded to this one in a few days. While I still can't reproduce under any circumstances, I finally got a chance to look at the code - and it's a simple null dereference.
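For readers unfamiliar with the term, a null dereference means the code used a reference that was null without checking it first. The Python sketch below illustrates the general pattern only, with None standing in for a null pointer. It is not the WebKit code or the attached patch; `Subresource`, `archive_sizes_unguarded`, and `archive_sizes_guarded` are hypothetical names, and the idea that a failed item carries no data is an assumption drawn from the earlier comments.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Subresource:
    url: str
    data: Optional[bytes]  # None models an item that failed to load (assumption)


def archive_sizes_unguarded(resources: List[Subresource]) -> List[int]:
    # Uses `data` without checking it first: raises TypeError when data is None,
    # the Python analogue of dereferencing a null pointer.
    return [len(resource.data) for resource in resources]


def archive_sizes_guarded(resources: List[Subresource]) -> List[int]:
    # Checks for None first and skips items that have no data.
    return [len(resource.data) for resource in resources if resource.data is not None]
```

With one loaded item and one failed item in the list, the unguarded version raises an exception while the guarded version simply skips the failed item.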
Created attachment 20227 [details] Proposed fix (no layout test...) Attached the obvious fix - but since I can't repro, I don't know how to make a layout test for this...
(In reply to comment #25)
> Attached the obvious fix - but since I can't repro, I don't know how to make a
> layout test for this...

Thanks!

Is it possible to download a build containing this fix? I would be happy to give it a try.
(In reply to comment #26)
> (In reply to comment #25)
> > Attached the obvious fix - but since I can't repro, I don't know how to make a
> > layout test for this...
>
> thanks!
>
> is it possible to download a build containing this fix? i would be happy to
> give it a try.

Not yet - the patch should be reviewed and landed today and will hopefully appear in a nightly soon.
Comment on attachment 20227 [details] Proposed fix (no layout test...) r=me
Landed in r31467
Tested r31535: the crashing problem is gone (as expected). Thanks!

However, blocked items are not appearing in the Activity window (only a small percentage of blocked items are shown). Also, the Status bar is underreporting errors, because blocked items that do not appear in the Activity window are not reported as errors.

A few examples:

On www.drudgereport.com: everything is blocked other than drudgereport.com and d.yimg.com. harvest.adgardener.com items appear in Activity correctly ("can't connect to host") and are reported correctly as errors in the Status bar. Other blocked domains do not appear in Activity and are not counted as errors.

On www.cnn.com: almost everything is blocked; www.cnn.com, i.cdn.turner.com, and i.l.cnn.net are not blocked. metrics.cnn.com is blocked; it is the only blocked domain that appears in the Activity window and is counted as an error.

On http://www.time.com/time/politics/article/0,8599,1725514,00.html : everything is blocked except www.time.com and img.timeinc.net. No blocked items appear in the Activity window, and no errors are reported.
Created attachment 20274 [details] hosts.txt
Here is a sample hosts file containing the domains blocked for the examples described in #30 above.
You are now describing a completely different issue. If you could, please write up a new bug with the new issue. Also, please make sure you compare the behavior of shipping Safari 3.1 to the latest nightly to see if they differ. Thanks!
(In reply to comment #32)
> If you could, please
> write up a new bug with the new issue.

OK, will do... Thanks!

(In reply to comment #32)
> Also, please make sure you compare the behavior of shipping Safari 3.1 to the
> latest nightly to see if they differ.

Yes, Safari 3.1 reports the Activity ("can't connect to host") correctly.
> write up a new bug with the new issue.

Added as bug #18267