This is the cleanup for the FileReader patch (https://bugs.webkit.org/show_bug.cgi?id=38157) per the comment from Adam Barth: WebCore/html/FileReader.cpp:256 + if (m_rawData.size() >= 2 && m_rawData[0] == '\xFE' && m_rawData[1] == '\xFF') { Yuck. Is this in the spec? We should move this to a more generic location.
Created attachment 56101 [details] Proposed Patch
Comment on attachment 56101 [details] Proposed Patch Encoding sniffing is definitely not something the TextEncoding class should responsible for. And I think that we already have all the code for this - can't you just feed the data to TextResourceDecoder as text/plain?
Created attachment 56525 [details] Proposed Patch Changed to call TextResourceDecoder::decode.
+ // Per the File API spec, we SHOULD use the supplied encoding and do not check for BOM if the encoding is provided and valid. Since this is SHOULD, and not a MUST, we could also ignore this requirement, like we do for Web content (it's also debatable whether it's right to always have the BOM override every other indication of character set, but that's what we do). I don't have a strong opinion, just wanted to point out that it might be good to be consistent within WebKit.
Created attachment 56531 [details] Proposed Patch Your suggestion makes sense. I changed to ignore this requirement in order to be consistent with other part of WebKit.
Comment on attachment 56531 [details] Proposed Patch + m_result = m_decoder->decode(&m_rawData.at(0), m_rawData.size()); One also needs to call flush() to make sure that e.g. incomplete UTF-8 sequences at the end get converted to U+FFFD REPLACEMENT CHARACTER, and to be generally compatible with TextResourceDecoder design. r=me if you call flush() and append the result.
Fixed as suggested and committed as http://trac.webkit.org/changeset/59797.