Bug 249217 - Optimize StringImpl::lower() for 8-bit strings
Summary: Optimize StringImpl::lower() for 8-bit strings
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs
Version: Safari Technology Preview
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2022-12-13 05:37 PST by Ahmad Saleem
Modified: 2023-02-17 12:11 PST
CC List: 5 users

See Also:


Description Ahmad Saleem 2022-12-13 05:37:44 PST
Hi Team,

While going through Blink's commits, I found another potential optimization that may or may not be worth adopting in WebKit.

I just wanted to get an opinion, so I am raising this as a bug.

Blink Commit - https://chromium.googlesource.com/chromium/blink/+/1be141ec1dd0fd7d34ea76d5dfd44efda521f09f

Performance Test Case - https://jsfiddle.net/s057bomc/

*** Safari 16.1 ***

Time:
values 422.16358839050105, 418.84816753926754, 419.94750656168003, 421.05263157894734, 419.9475065616798, 419.9475065616798, 421.05263157894785, 421.60737812911725, 421.05263157894734, 422.16358839050133, 422.16358839050133, 421.60737812911725, 423.2804232804233, 419.9475065616788, 422.72126816380654, 419.3971166448231, 424.4031830238737, 422.7212681638055, 420.4993429697766, 422.72126816380654 runs/s
avg 421.36227401814403 runs/s
median 421.33000485403255 runs/s
stdev 1.4461312032824363 runs/s
min 418.84816753926754 runs/s
max 424.4031830238737 runs/s

*** Chrome Canary 110 ***

Time:
values 1524.9112659631633, 1522.309710928858, 1533.782890121479, 1534.7975654693091, 1540.0955919208254, 1541.9380564797448, 1541.7331209736133, 1539.0739022291139, 1539.6867531897585, 1533.9857179672756, 1530.9489246413536, 1540.9139215798646, 1540.504647830488, 1540.9139212138487, 1542.1430470863916, 1535.407015706247, 1539.2781319513604, 1538.869726693626, 1532.3645967318255, 1540.913921335854 runs/s
avg 1536.7286215007 runs/s
median 1539.176017090237 runs/s
stdev 5.65061932398092 runs/s
min 1522.309710928858 runs/s
max 1542.1430470863916 runs/s

*** Firefox Nightly 109 ***


Time:
values 645.9627329192547, 650.8135168961202, 717.9487179487179, 704.4025157232704, 705.2896725440806, 650.8135168961202, 658.2278481012659, 651.6290726817043, 650.8135168961202, 706.1790668348045, 714.2857142857143, 705.2896725440806, 716.1125319693094, 716.1125319693094, 685.1119894598155, 700, 677.9661016949152, 718.8703465982028, 717.9487179487179, 706.1790668348045 runs/s
avg 689.9978425373165 runs/s
median 704.8460941336755 runs/s
stdev 27.951382058079524 runs/s
min 645.9627329192547 runs/s
max 718.8703465982028 runs/s


*** Safari Technology Preview 159 ***

Time:
values 416.6666666666669, 415.0453955901427, 417.75456919060076, 415.0453955901432, 418.3006535947707, 417.209908735331, 417.20990873533344, 417.7545691906, 417.20990873533344, 419.9475065616778, 421.0526315789494, 421.05263157894535, 417.20990873533344, 414.5077720207254, 418.30065359477123, 414.5077720207254, 416.66666666666566, 415.0453955901427, 414.5077720207254, 418.30065359477123 runs/s
avg 417.1648169996177 runs/s
median 417.20990873533344 runs/s
stdev 2.0222382228721028 runs/s
min 414.5077720207254 runs/s
max 421.0526315789494 runs/s

_____

I just wanted to raise this bug so that the optimization can be tracked or discussed.

Thanks!
Comment 1 Radar WebKit Bug Importer 2022-12-20 05:38:16 PST
<rdar://problem/103554544>
Comment 2 Darin Adler 2022-12-21 12:27:18 PST
This microbenchmark being 3-4x faster in Chromium is an excellent opportunity for optimization.

The StringImpl::lower part is a very small part of this (it was 12% at the time they did it), but it could still be worthwhile to tighten up. I'm not sure we would want *exactly* what they did in Chromium, since the most important case to optimize is probably no uppercase letters at all, followed by all ASCII.
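
To make the ordering of cases above concrete, here is a minimal sketch of a tiered 8-bit lowercase routine. It is not WebKit's or Blink's code: `SharedString`, `lowerLatin1`, `isASCIIUpper`, and `toLatin1Lower` are hypothetical names, a `std::shared_ptr<const std::string>` stands in for a ref-counted 8-bit StringImpl, and the Latin-1 fallback assumes the simple mapping where 0xC0..0xDE (except 0xD7) lowercases by adding 0x20.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical stand-in for a ref-counted 8-bit string (not WTF::StringImpl).
using SharedString = std::shared_ptr<const std::string>;

// True only for ASCII 'A'..'Z'.
inline bool isASCIIUpper(unsigned char c) { return c >= 'A' && c <= 'Z'; }

// Assumed Latin-1 mapping: ASCII letters plus 0xC0..0xDE except 0xD7 (the
// multiplication sign), all of which lowercase by adding 0x20.
inline unsigned char toLatin1Lower(unsigned char c)
{
    if (isASCIIUpper(c) || (c >= 0xC0 && c <= 0xDE && c != 0xD7))
        return static_cast<unsigned char>(c + 0x20);
    return c;
}

SharedString lowerLatin1(const SharedString& input)
{
    const std::string& s = *input;

    // One scan: note whether any ASCII uppercase letter appears, and OR all
    // code units together so the high bit reveals any non-ASCII character.
    bool sawASCIIUpper = false;
    unsigned char ored = 0;
    for (char ch : s) {
        auto c = static_cast<unsigned char>(ch);
        sawASCIIUpper = sawASCIIUpper || isASCIIUpper(c);
        ored = static_cast<unsigned char>(ored | c);
    }
    bool allASCII = !(ored & 0x80);

    // Case 1 (assumed most common): ASCII with no uppercase letters.
    // Return the same object, so nothing is allocated or copied.
    if (allASCII && !sawASCIIUpper)
        return input;

    auto result = std::make_shared<std::string>(s);

    // Case 2: all ASCII, so lowercasing is just setting bit 0x20 on A..Z.
    if (allASCII) {
        for (char& ch : *result) {
            if (isASCIIUpper(static_cast<unsigned char>(ch)))
                ch = static_cast<char>(ch | 0x20);
        }
        return result;
    }

    // Case 3: general Latin-1 fallback.
    for (char& ch : *result)
        ch = static_cast<char>(toLatin1Lower(static_cast<unsigned char>(ch)));
    return result;
}
```

In this sketch a caller can tell that the fast path was taken by comparing the returned pointer with the input pointer. Whether such a restructuring would move the benchmark at all is a separate question that only profiling can answer.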
Comment 3 Chris Dumez 2023-02-17 12:10:33 PST
I profiled this benchmark and don't see any lowercasing near the top:
```
Sample Count, Samples %, Normalized CPU %, Symbol
19352, 30.2%, 8.4%, com.apple.WebKit (63672)
3114, 4.9%, 1.3%,     WebCore::Style::ElementRuleCollector::collectMatchingRules(WebCore::Style::MatchRequest const&) (in WebCore)
1766, 2.8%, 0.8%,     WebCore::Style::Invalidator::invalidateStyleForDescendants(WebCore::Element&, WebCore::Style::SelectorMatchingState*) (in WebCore)
1245, 1.9%, 0.5%,     WTF::fastMalloc(unsigned long) (in JavaScriptCore)
1225, 1.9%, 0.5%,     WTF::Vector<WebCore::Style::MatchedProperties, 0ul, WTF::CrashOnOverflow, 16ul, WTF::FastMalloc>::~Vector() (in WebCore)
1204, 1.9%, 0.5%,     pas_thread_local_cache_flush_deallocation_log (in JavaScriptCore)
1128, 1.8%, 0.5%,     WebCore::Style::ElementRuleCollector::~ElementRuleCollector() (in WebCore)
1010, 1.6%, 0.4%,     WTF::fastFree(void*) (in JavaScriptCore)
975, 1.5%, 0.4%,     WebCore::SelectorFilter::pushParent(WebCore::Element*) (in WebCore)
944, 1.5%, 0.4%,     WebCore::Style::Invalidator::invalidateIfNeeded(WebCore::Element&, WebCore::Style::SelectorMatchingState*) (in WebCore)
756, 1.2%, 0.3%,     WebCore::Style::ElementRuleCollector::collectMatchingRules(WebCore::Style::CascadeLevel) (in WebCore)
746, 1.2%, 0.3%,     WebCore::Style::ElementRuleCollector::ElementRuleCollector(WebCore::Element const&, WebCore::Style::RuleSet const&, WebCore::Style::SelectorMatchingState*) (in WebCore)
604, 0.9%, 0.3%,     WebCore::Style::ElementRuleCollector::collectMatchingRulesForList(WTF::Vector<WebCore::Style::RuleData, 1ul, WTF::CrashOnOverflow, 16ul, WTF::FastMalloc> const*, WebCore::Style::MatchRequest const&) (in WebCore)
528, 0.8%, 0.2%,     WebCore::Style::ElementRuleCollector::matchesAnyAuthorRules() (in WebCore)
390, 0.6%, 0.2%,     WebCore::SelectorFilter::popParent() (in WebCore)
223, 0.3%, 0.1%,     WebCore::NodeTraversal::nextAncestorSibling(WebCore::Node const&, WebCore::Node const*) (in WebCore)
220, 0.3%, 0.1%,     DYLD-STUB$$WTF::fastFree(void*) (in WebCore)
162, 0.3%, 0.1%,     WebCore::Node::containingShadowRoot() const (in WebCore)
149, 0.2%, 0.1%,     WebCore::LayoutRect::intersects(WebCore::LayoutRect const&) const (in WebCore)
131, 0.2%, 0.1%,     DYLD-STUB$$WTF::fastMalloc(unsigned long) (in WebCore)
122, 0.2%, 0.1%,     _platform_memmove (in libsystem_platform.dylib)
97, 0.2%, 0.0%,     WebCore::ScrollingStateNode::indexOfChild(WebCore::ScrollingStateNode&) const (in WebCore)
94, 0.1%, 0.0%,     WebCore::accumulateOffsetTowardsAncestor(WebCore::RenderLayer const*, WebCore::RenderLayer const*, WebCore::LayoutPoint&, WebCore::RenderLayer::ColumnOffsetAdjustment) (in WebCore)
73, 0.1%, 0.0%,     bmalloc_heap_config_specialized_local_allocator_try_allocate_small_segregated_slow (in JavaScriptCore)
44, 0.1%, 0.0%,     WebCore::RectList::intersects(WebCore::LayoutRect const&) const (in WebCore)
```