Bug 75877

Summary: Dynamically changing favicons may cause memory limit
Product: WebKit Reporter: Rafael Brandao <rafael.lobo>
Component: New BugsAssignee: Rafael Brandao <rafael.lobo>
Status: NEW ---    
Severity: Normal CC: abarth, ap, beidson, cmarcelo, fishd, groby, kling, shonheldin
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   

Description Rafael Brandao 2012-01-09 11:46:20 PST
Document::addIconURL keeps adding the new icon URL everytime the favicon is changed on DOM, but this may overflow at some point. We should either keep a map of this (to avoid adding the same url) or just discard previous icon url references that may not be used anymore (this sounds better).

The url http://www.p01.org/releases/DEFENDER_of_the_favicon/ can show this bug. It runs a script that runs a game and paints the frame as a favicon. Before the game begins you can see a small loop... but DOM keeps adding the same urls on a Vector.
Comment 1 Brady Eidson 2012-01-09 14:17:34 PST
First off, I can't replicate the reported bug in Safari.  I've tried watching WebProcess memory usage.  Since we quite purposefully don't allow dynamic changes/animations to the favicon the game won't play in the icon and I can't tell if the game is actually "working"

But as far as hitting some memory limit by dynamically changing the icon...  Documents can - by design in the engine and in HTML5 - have multiple icon URLs.  That's why we add new icon urls instead of replacing the previous one.

It's very true that we're kind of dumb about it, always adding a new icon to the list even if the mimeType, sizes, and iconType are all the same as a previous entry.  But changing that to be a hash based on the mime/icon/size tuple would not eliminate this theoretical bug as a page could start dynamically adding icons with unique identifying characteristics that could eat up all memory.  This is really no different than running specially crafted javascript in a million other ways designed to eat up all memory and I'm not sure we have a solution for it in the works.

I'll look more at this if someone provides a test case that obviously reproduces in WebKit and that does *not* reproduce in another HTML5 browser.
Comment 2 Rafael Brandao 2012-01-25 13:20:13 PST
(In reply to comment #1)
> First off, I can't replicate the reported bug in Safari.  I've tried watching WebProcess memory usage.  Since we quite purposefully don't allow dynamic changes/animations to the favicon the game won't play in the icon and I can't tell if the game is actually "working"
> 
> But as far as hitting some memory limit by dynamically changing the icon...  Documents can - by design in the engine and in HTML5 - have multiple icon URLs.  That's why we add new icon urls instead of replacing the previous one.
> 
> It's very true that we're kind of dumb about it, always adding a new icon to the list even if the mimeType, sizes, and iconType are all the same as a previous entry.  But changing that to be a hash based on the mime/icon/size tuple would not eliminate this theoretical bug as a page could start dynamically adding icons with unique identifying characteristics that could eat up all memory.  This is really no different than running specially crafted javascript in a million other ways designed to eat up all memory and I'm not sure we have a solution for it in the works.
> 

Sorry for the delay. I agree that we can run javascript in a way that we would end up eating all memory, but I don't think this is the case (at least it shouldn't). The code on that page does the following:

  var icon=$('favicon');
  (newIcon = icon.cloneNode(true)).setAttribute('href',ctx.canvas.toDataURL());
  icon.parentNode.replaceChild(newIcon,icon);


So it is not actually adding any favicon to the page, it is replacing it. Yet we handle this as a simple addition, this is why it has been bothering me. I was wondering if we couldn't make use of HTMLLinkElement destructor and get rid of the icon we've added.

> I'll look more at this if someone provides a test case that obviously reproduces in WebKit and that does *not* reproduce in another HTML5 browser.

I'm thinking if there's such case, but even if we can't find one or if we do find and also reproduce in other browsers, does this mean this should be the correct behavior? It still looks wrong to me.
Comment 3 Brady Eidson 2012-01-25 13:41:07 PST
(In reply to comment #2)
> (In reply to comment #1)
> > First off, I can't replicate the reported bug in Safari.  I've tried watching WebProcess memory usage.  Since we quite purposefully don't allow dynamic changes/animations to the favicon the game won't play in the icon and I can't tell if the game is actually "working"
> > 
> > But as far as hitting some memory limit by dynamically changing the icon...  Documents can - by design in the engine and in HTML5 - have multiple icon URLs.  That's why we add new icon urls instead of replacing the previous one.
> > 
> > It's very true that we're kind of dumb about it, always adding a new icon to the list even if the mimeType, sizes, and iconType are all the same as a previous entry.  But changing that to be a hash based on the mime/icon/size tuple would not eliminate this theoretical bug as a page could start dynamically adding icons with unique identifying characteristics that could eat up all memory.  This is really no different than running specially crafted javascript in a million other ways designed to eat up all memory and I'm not sure we have a solution for it in the works.
> > 
> 
> Sorry for the delay. I agree that we can run javascript in a way that we would end up eating all memory, but I don't think this is the case (at least it shouldn't). The code on that page does the following:
> 
>   var icon=$('favicon');
>   (newIcon = icon.cloneNode(true)).setAttribute('href',ctx.canvas.toDataURL());
>   icon.parentNode.replaceChild(newIcon,icon);
> 
> 
> So it is not actually adding any favicon to the page, it is replacing it. Yet we handle this as a simple addition, this is why it has been bothering me. I was wondering if we couldn't make use of HTMLLinkElement destructor and get rid of the icon we've added.

*THIS* code replaces it, yes.  I was speaking of theoretical, specially crafted code that adds new ones over and over.

> 
> > I'll look more at this if someone provides a test case that obviously reproduces in WebKit and that does *not* reproduce in another HTML5 browser.
> 
> I'm thinking if there's such case, but even if we can't find one or if we do find and also reproduce in other browsers, does this mean this should be the correct behavior? It still looks wrong to me.

This bug has the title "Dynamically changing favicons may cause memory limit".  What I'm saying is that, yes, that is true.  Since we're supporting multiple <link> tags referring to multiple icons, we have to accept that a specially crafted page can cause us to go out of memory.
Comment 4 Brady Eidson 2012-01-25 13:47:09 PST
(In reply to comment #3)
> (In reply to comment #2)
> > So it is not actually adding any favicon to the page, it is replacing it. Yet we handle this as a simple addition, this is why it has been bothering me. I was wondering if we couldn't make use of HTMLLinkElement destructor and get rid of the icon we've added.
> 
> *THIS* code replaces it, yes.  I was speaking of theoretical, specially crafted code that adds new ones over and over.

And what I'm *really* saying by stressing this point over and over is that while we could fix this bug for this one particular "innocent" page, this bug probably isn't super high of a priority since we can't fix it for malicious pages.

---

On a different note, this whole problem was recently introduced by http://trac.webkit.org/changeset/95593 to resolve https://bugs.webkit.org/show_bug.cgi?id=65564.

That patch probably should've been more careful to not introduce this bug.

I can also spot what I think is a behavorial bug in that patch:
IconURL newURL(KURL(ParsedURLString, url), sizes, mimeType, iconType); 
...
if (Frame* f = frame()) { 
    IconURL iconURL = f->loader()->icon()->iconURL(iconType); 
    if (iconURL == newURL) 
    f->loader()->didChangeIcons(iconType); 
}

Why does it notify "didChangeIcons" if the URLs did *NOT* change?  That's bizarre.

But that's separate.  CC'ing the patches author and reviewer here.
Comment 5 Rafael Brandao 2012-01-25 13:52:49 PST
(In reply to comment #4)
> (In reply to comment #3)
> > (In reply to comment #2)
> > > So it is not actually adding any favicon to the page, it is replacing it. Yet we handle this as a simple addition, this is why it has been bothering me. I was wondering if we couldn't make use of HTMLLinkElement destructor and get rid of the icon we've added.
> > 
> > *THIS* code replaces it, yes.  I was speaking of theoretical, specially crafted code that adds new ones over and over.
> 
> And what I'm *really* saying by stressing this point over and over is that while we could fix this bug for this one particular "innocent" page, this bug probably isn't super high of a priority since we can't fix it for malicious pages.
> 

Oh, I see. Thanks for the input.

> ---
> 
> On a different note, this whole problem was recently introduced by http://trac.webkit.org/changeset/95593 to resolve https://bugs.webkit.org/show_bug.cgi?id=65564.
> 
> That patch probably should've been more careful to not introduce this bug.
> 
> I can also spot what I think is a behavorial bug in that patch:
> IconURL newURL(KURL(ParsedURLString, url), sizes, mimeType, iconType); 
> ...
> if (Frame* f = frame()) { 
>     IconURL iconURL = f->loader()->icon()->iconURL(iconType); 
>     if (iconURL == newURL) 
>     f->loader()->didChangeIcons(iconType); 
> }
> 
> Why does it notify "didChangeIcons" if the URLs did *NOT* change?  That's bizarre.
> 
> But that's separate.  CC'ing the patches author and reviewer here.

If I remember correctly, I believe it retrieves the iconURL because we could have resolved another url. So he adds the icon url, asks if that is the one we are using, and then if it is indeed that one, it triggers the icon change.
Comment 6 Rafael Brandao 2012-01-25 13:56:40 PST
Not sure if I was clear though... Let me try again, we add the icon url, then asks if with given the current options, the new url is the chosen one. If so, then we can let the client know that the icon has been changed. What I don't remember is if we check if this "new url", besides being just added, was not there before, so we could have chosen that one already before adding the new favicon... and then it wouldn't really be an icon change.
Comment 7 Rachel Blum 2012-01-25 14:26:29 PST
rafael.lobo is correct - this is not a bug, but simply a check if the favicon now resolves to the new URL (i.e. has changed). My apologies for the confusion caused.

Not checking if the URL is already there is indeed an oversight. It was not an issue before since we were limited to 3 icons total.

We can't simply hash based on name/mime/size there, though - it is technically possible you'd be replacing the favicon with a different one with a changed mime type. Instead, we need to remove old favicons if the associated <link> tag is removed.
Comment 8 Brady Eidson 2012-01-25 14:37:42 PST
(In reply to comment #7)
> rafael.lobo is correct - this is not a bug, but simply a check if the favicon now resolves to the new URL (i.e. has changed). My apologies for the confusion caused.

Thanks for clearing this up.

> Not checking if the URL is already there is indeed an oversight. It was not an issue before since we were limited to 3 icons total.

It's not so simple as allowing each unique URL only once, though, as the same URL could be used in multiple link tags.

> We can't simply hash based on name/mime/size there, though - it is technically possible you'd be replacing the favicon with a different one with a changed mime type. Instead, we need to remove old favicons if the associated <link> tag is removed.

I agree that removing from the set on <link> detachment is a good way forward.

But we still need to make this a hash.  We don't like working in Vectors when we know the potential data set is unbounded *and* lookups are necessary.
Comment 9 Rafael Brandao 2012-01-25 14:48:03 PST
(In reply to comment #8)
> (In reply to comment #7)
> > rafael.lobo is correct - this is not a bug, but simply a check if the favicon now resolves to the new URL (i.e. has changed). My apologies for the confusion caused.
> 
> Thanks for clearing this up.
> 
> > Not checking if the URL is already there is indeed an oversight. It was not an issue before since we were limited to 3 icons total.
> 
> It's not so simple as allowing each unique URL only once, though, as the same URL could be used in multiple link tags.

Yes, but we could check the previous resolved icon url, then add the new url, and then check the resolved url again. If they are equal, then no icon has changed (but we did add a new favicon on the page).

> 
> > We can't simply hash based on name/mime/size there, though - it is technically possible you'd be replacing the favicon with a different one with a changed mime type. Instead, we need to remove old favicons if the associated <link> tag is removed.
> 
> I agree that removing from the set on <link> detachment is a good way forward.
> 
> But we still need to make this a hash.  We don't like working in Vectors when we know the potential data set is unbounded *and* lookups are necessary.

I think the only limitation to change that Vector is the order of those favicons... the order of them is important for the lookups (as the specs suggests).

A set would raise a problem in that order, and cause some others: imagine we have two different favicons, both pointing to the same url. Then we change one of them to another url. We should have now two different urls for favicon for that particular page, but I believe if we use a set, we would end up with only one.
Comment 10 Brady Eidson 2012-01-25 15:01:07 PST
(In reply to comment #9)
> (In reply to comment #8)
> > (In reply to comment #7)

> > > We can't simply hash based on name/mime/size there, though - it is technically possible you'd be replacing the favicon with a different one with a changed mime type. Instead, we need to remove old favicons if the associated <link> tag is removed.
> > 
> > I agree that removing from the set on <link> detachment is a good way forward.
> > 
> > But we still need to make this a hash.  We don't like working in Vectors when we know the potential data set is unbounded *and* lookups are necessary.
> 
> I think the only limitation to change that Vector is the order of those favicons... the order of them is important for the lookups (as the specs suggests).

Ugh, I haven't read this section of the spec in awhile.  I'll look at it and see precisely what guarantees have to be made.

> A set would raise a problem in that order, and cause some others: imagine we have two different favicons, both pointing to the same url. Then we change one of them to another url. We should have now two different urls for favicon for that particular page, but I believe if we use a set, we would end up with only one.

I should have specified that I meant a HashMAP, not a HashSET.

The ordering problem is annoying, though there's ways around it...
Comment 11 Rachel Blum 2012-01-25 15:07:40 PST
(In reply to comment #10)
> (In reply to comment #9)

> > I think the only limitation to change that Vector is the order of those favicons... the order of them is important for the lookups (as the specs suggests).
> 
> Ugh, I haven't read this section of the spec in awhile.  I'll look at it and see precisely what guarantees have to be made.

"If there are multiple equally appropriate icons, user agents must use the last one declared in tree order at the time that the user agent collected the list of icons"

Let's just say <link rel="icon"> does not generate happy campers.
Comment 12 Rafael Brandao 2012-01-25 15:11:13 PST
The part of the spec I was talking about is this one:

"(...) the user agent must select the most appropriate icon according to the type, media, and sizes attributes. If there are multiple equally appropriate icons, user agents must use the last one declared in tree order. If the user agent tries to use an icon but that icon is determined, upon closer examination, to in fact be inappropriate (e.g. because it uses an unsupported format), then the user agent must try the next-most-appropriate icon as determined by the attributes." (link: http://www.w3.org/TR/html5/links.html#rel-icon)

I'll report another bug so we can fix the false icon url changed signal we've found out here.
Comment 13 Ilya Pishchulin 2021-12-29 01:58:23 PST
Any news? Problem still relevant.