WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
107603
Call XSSAuditor.filterToken() from threaded HTML parser
https://bugs.webkit.org/show_bug.cgi?id=107603
Summary
Call XSSAuditor.filterToken() from threaded HTML parser
Tony Gentilcore
Reported
2013-01-22 17:29:57 PST
Call XSSAuditor.filterToken() from threaded HTML parser
Attachments
Patch
(16.86 KB, patch)
2013-02-06 15:02 PST
,
Tony Gentilcore
no flags
Details
Formatted Diff
Diff
Patch
(21.56 KB, patch)
2013-02-06 16:48 PST
,
Tony Gentilcore
no flags
Details
Formatted Diff
Diff
Show Obsolete
(1)
View All
Add attachment
proposed patch, testcase, etc.
Eric Seidel (no email)
Comment 1
2013-01-22 18:14:51 PST
omg.
Adam Barth
Comment 2
2013-01-22 18:27:14 PST
Using the atomic token probably isn't going to work. The XSS filter needs to be able to call sourceForToken, which isn't going to work for atomic tokens. I think that means we're going to need to run the XSS filter on the background thread.
Adam Barth
Comment 3
2013-01-22 18:29:11 PST
We should be able to deal with the HTMLNames dependency.
Tony Gentilcore
Comment 4
2013-01-23 11:12:26 PST
This is going to be pretty tricky. Adam and I discussed it offline and came up with the following sketch: - HTMLDocumentParser creates the XSSAuditor on the main thread - HTMLDocumentParser calls XSSAuditor::init() on the main thread - Transfer ownership of the XSSAuditor to the parser thread when the parser thread is created - Call filterToken() on the parser thread - XSSAuditor's actions in filterToken()'s didBlockScript block are queued up in the token steam and if the HTMLDocumentParser uses the token, it executes the action on the main thread. The tricky part is sharing string data, particularly for POST data which may be large. One option is to parse on the main thread for POST data or just for large POST data. The other question is whether we can break this into multiple patches so it isn't one monolithic switch.
Eric Seidel (no email)
Comment 5
2013-01-23 11:22:02 PST
I think we should consider splitting the filterToken bits out into a separate object which does not depend on document, etc. Or maybe the document, etc. parts are in an XSSFilterDelegate object. In either case, I think we should make it easy to tell what parts are OK on the background thread vs. not. Obviously all of XSSFilter isn't, and I don't think per-function separation is going to work as well as whole-object separation would.
Tony Gentilcore
Comment 6
2013-01-31 17:57:33 PST
After the blocking patches, there are 3 non-thread safe dependencies left: 1. sourceForToken()/Document's TextResourceDecoder 2. Tokenizer's shouldAllowCDATA() 3. Documents's url() #3 is easy, but I'm trying to figure out the cleanest way to handle #1 and #2. Any thoughts?
Adam Barth
Comment 7
2013-02-01 00:28:48 PST
(In reply to
comment #6
)
> After the blocking patches, there are 3 non-thread safe dependencies left: > 1. sourceForToken()/Document's TextResourceDecoder
^^^ We should be able to get sourceForToken from the BackgroundHTMLParser. As for TextResourceDecoder, we can get the name of the decoder and create a new one on the background thread.
> 2. Tokenizer's shouldAllowCDATA()
^^^ This will also be available from the BackgroundHTMLParser's tokenizer.
> 3. Documents's url()
^^^ As you say, this one is easy. :)
Tony Gentilcore
Comment 8
2013-02-06 15:02:04 PST
Created
attachment 186929
[details]
Patch
Adam Barth
Comment 9
2013-02-06 15:56:13 PST
Comment on
attachment 186929
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=186929&action=review
> Source/WebCore/html/parser/BackgroundHTMLParser.cpp:165 > + {
Why do we need these { } brackets?
> Source/WebCore/html/parser/HTMLDocumentParser.cpp:531 > + ASSERT(decoder->hasOneRef());
Is the decoder itself safe to send to another thread? Perhaps we should add a isSafeToSendToAnotherThread function to it.
Adam Barth
Comment 10
2013-02-06 15:57:38 PST
Comment on
attachment 186929
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=186929&action=review
> Source/WebCore/ChangeLog:8 > + With this patch we now pass 178 of 182 tests in http/tests/security/xssAuditor.
That's great!
> Source/WebCore/html/parser/HTMLDocumentParser.cpp:528 > + Settings* settings = document()->settings(); > + bool usesEncodingDetector = settings && settings->usesEncodingDetector(); > + String mimeType; > + if (DocumentLoader* loader = document()->loader()) > + mimeType = loader->responseMIMEType(); > + String defaultTextEncodingName; > + if (settings) > + defaultTextEncodingName = settings->defaultTextEncodingName(); > + RefPtr<TextResourceDecoder> decoder = TextResourceDecoder::create(mimeType, defaultTextEncodingName, usesEncodingDetector);
Can we break this logic into a separate function that returns a PassRefPtr<TextResourceDecoder> ?
Tony Gentilcore
Comment 11
2013-02-06 16:10:56 PST
(In reply to
comment #9
)
> (From update of
attachment 186929
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=186929&action=review
> > > Source/WebCore/html/parser/BackgroundHTMLParser.cpp:165 > > + { > > Why do we need these { } brackets?
To delete the CompactHTMLToken off the stack so that the token passes the isSafeToSendToAnotherThread() test.
> > > Source/WebCore/html/parser/HTMLDocumentParser.cpp:531 > > + ASSERT(decoder->hasOneRef()); > > Is the decoder itself safe to send to another thread? Perhaps we should add a isSafeToSendToAnotherThread function to it.
It looks safe to me. I don't see anything that we can validate (like Strings, RefPtrs, KURLs, etc). Would you mind double checking in case I missed something?
Adam Barth
Comment 12
2013-02-06 16:15:44 PST
Comment on
attachment 186929
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=186929&action=review
>>> Source/WebCore/html/parser/HTMLDocumentParser.cpp:531 >>> + ASSERT(decoder->hasOneRef()); >> >> Is the decoder itself safe to send to another thread? Perhaps we should add a isSafeToSendToAnotherThread function to it. > > It looks safe to me. I don't see anything that we can validate (like Strings, RefPtrs, KURLs, etc). Would you mind double checking in case I missed something?
TextResourceDecoder has a HTMLMetaCharsetParser, which uses HTMLNames. TextCodec has a bunch of subclasses, which I haven't studied in detail. We might need to look at this more carefully.
Adam Barth
Comment 13
2013-02-06 16:17:58 PST
Looks like we need to make sure we're not going to use the HTMLMetaCharsetParser.
Adam Barth
Comment 14
2013-02-06 16:18:51 PST
Comment on
attachment 186929
[details]
Patch The bulk of this change looks good, but we need to be more careful about moving the TextResourceDecoder between threads.
Eric Seidel (no email)
Comment 15
2013-02-06 16:19:13 PST
(In reply to
comment #13
)
> Looks like we need to make sure we're not going to use the HTMLMetaCharsetParser.
I think we're just gonna ned up taking the eventNames() per-thread AtomicString approach and not caring in the short-term. We need to see how much of a perf-hit that would be, but it would dramatically simplify moving code between threads like this.
Adam Barth
Comment 16
2013-02-06 16:20:21 PST
Maybe, but HTMLMetaCharsetParser calls , m_assumedCodec(newTextCodec(Latin1Encoding())) which might not be threadsafe either.
Adam Barth
Comment 17
2013-02-06 16:22:36 PST
static String fullyDecodeString(const String& string, const TextResourceDecoder* decoder) in XSSAuditor.cpp calls: const TextEncoding& encoding = decoder ? decoder->encoding() : UTF8Encoding(); It looks like we need only the TextEncoding, not the TextResourceDecoder.
Adam Barth
Comment 18
2013-02-06 16:26:20 PST
newTextCodec itself looks ok: PassOwnPtr<TextCodec> newTextCodec(const TextEncoding& encoding) { MutexLocker lock(encodingRegistryMutex()); ASSERT(textCodecMap); TextCodecFactory factory = textCodecMap->get(encoding.name()); ASSERT(factory.function); return factory.function(encoding, factory.additionalData); }
Tony Gentilcore
Comment 19
2013-02-06 16:48:57 PST
Created
attachment 186948
[details]
Patch
Tony Gentilcore
Comment 20
2013-02-06 16:49:35 PST
(In reply to
comment #17
)
> static String fullyDecodeString(const String& string, const TextResourceDecoder* decoder) > > in XSSAuditor.cpp calls: > > const TextEncoding& encoding = decoder ? decoder->encoding() : UTF8Encoding(); > > It looks like we need only the TextEncoding, not the TextResourceDecoder.
I'm really glad you noticed that! The patch is so much cleaner and safer now.
Tony Gentilcore
Comment 21
2013-02-06 16:50:29 PST
Now we pass 180 of 182 tests. Two were failing in the previous patch due to creating the wrong decoder type. Now we use the proper one from the Document.
Adam Barth
Comment 22
2013-02-06 18:04:01 PST
Comment on
attachment 186948
[details]
Patch Ah, much nicer.
WebKit Review Bot
Comment 23
2013-02-07 05:02:39 PST
Comment on
attachment 186948
[details]
Patch Clearing flags on attachment: 186948 Committed
r142099
: <
http://trac.webkit.org/changeset/142099
>
WebKit Review Bot
Comment 24
2013-02-07 05:02:46 PST
All reviewed patches have been landed. Closing bug.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug