RESOLVED FIXED232380
JavaScript URL result should be treated as UTF-8 bytes
https://bugs.webkit.org/show_bug.cgi?id=232380
Summary JavaScript URL result should be treated as UTF-8 bytes
Chris Dumez
Reported 2021-10-27 07:54:12 PDT
JavaScript URL result should be treated as UTF-8 bytes: - https://github.com/whatwg/html/pull/6781
Attachments
Patch (5.18 KB, patch)
2021-10-27 08:10 PDT, Chris Dumez
no flags
Patch (7.35 KB, patch)
2021-10-27 11:05 PDT, Chris Dumez
no flags
Chris Dumez
Comment 1 2021-10-27 08:10:08 PDT
Darin Adler
Comment 2 2021-10-27 10:06:39 PDT
Comment on attachment 442591 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=442591&action=review > Source/WebCore/loader/DocumentWriter.cpp:80 > + setEncoding("UTF-8"_s, false); Ah, false. So much better than true. (Why I like enumerations, I guess?) > Source/WebCore/loader/DocumentWriter.cpp:96 > + if (DocumentParser* parser = m_frame->document()->parser()) { > + auto utf8Source = source.utf8(); > + parser->appendBytes(*this, reinterpret_cast<const uint8_t*>(utf8Source.data()), utf8Source.length()); > + } If this is performance-critical, and all-ASCII is common, and the string is likely in 8-bit, then we could consider a fast path for that case. Or even come up with an alternative to String::utf8 that doesn’t allocate and free memory. Likely, though, it’s not important to optimize.
Chris Dumez
Comment 3 2021-10-27 10:20:39 PDT
(In reply to Darin Adler from comment #2) > Comment on attachment 442591 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=442591&action=review > > > Source/WebCore/loader/DocumentWriter.cpp:80 > > + setEncoding("UTF-8"_s, false); > > Ah, false. So much better than true. (Why I like enumerations, I guess?) Yes, it is a pre-existing issue but I'll replace with an enum class. > > > Source/WebCore/loader/DocumentWriter.cpp:96 > > + if (DocumentParser* parser = m_frame->document()->parser()) { > > + auto utf8Source = source.utf8(); > > + parser->appendBytes(*this, reinterpret_cast<const uint8_t*>(utf8Source.data()), utf8Source.length()); > > + } > > If this is performance-critical, and all-ASCII is common, and the string is > likely in 8-bit, then we could consider a fast path for that case. Or even > come up with an alternative to String::utf8 that doesn’t allocate and free > memory. > > Likely, though, it’s not important to optimize. I doubt this is worth optimizing at this point.
Chris Dumez
Comment 4 2021-10-27 11:05:56 PDT
EWS
Comment 5 2021-10-27 12:05:53 PDT
Committed r284934 (243604@main): <https://commits.webkit.org/243604@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 442608 [details].
Radar WebKit Bug Importer
Comment 6 2021-10-27 12:06:21 PDT
Note You need to log in before you can comment on or make changes to this bug.