Bug 12931

Summary: ARM crash due to non-aligned memory access
Product: WebKit Reporter: Krzysztof Kowalczyk <kkowalczyk>
Component: New BugsAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal    
Priority: P2    
Version: 523.x (Safari 3)   
Hardware: PC   
OS: OS X 10.4   
Attachments:
Description Flags
ARM crash due to non-aligned memory access
darin: review-
Updated per comments
none
Updated per comments
darin: review-
Updated per review comments darin: review+

Description Krzysztof Kowalczyk 2007-02-28 17:08:03 PST
ARM crash due to non-aligned memory access
Comment 1 Krzysztof Kowalczyk 2007-02-28 17:12:10 PST
Created attachment 13431 [details]
ARM crash due to non-aligned memory access

On some ARM systems accessing 32-bit values at non-4-aligned addresses crashes or silently reads incorrect data (depending on how kernel handles non-aligned access trap). This patch ensures that if any of the strings are not 4-byte aligned, we do the slower, but working, 16-bit access.
Comment 2 Darin Adler 2007-02-28 17:20:35 PST
Comment on attachment 13431 [details]
ARM crash due to non-aligned memory access

This will break the build on Mac OS X, because we will get a warning (treated as an error) about the aligned4 function unused but having internal linkage.

Is it really worthwhile to optimize the aligned case? Instead we should probably just always use the character loop on ARM and ignore the faster loop entirely.
Comment 3 Krzysztof Kowalczyk 2007-02-28 17:39:10 PST
Created attachment 13432 [details]
Updated per comments

I assume it would be faster to do 32-bit at a time vs. 16-bit at a time, but I have no data to back it up so updated patch gets rid of aligned4() function and always does 16-bit access.
Comment 4 Krzysztof Kowalczyk 2007-02-28 17:39:28 PST
Created attachment 13433 [details]
Updated per comments

I assume it would be faster to do 32-bit at a time vs. 16-bit at a time, but I have no data to back it up so updated patch gets rid of aligned4() function and always does 16-bit access.
Comment 5 Darin Adler 2007-03-01 10:21:35 PST
Comment on attachment 13433 [details]
Updated per comments

Looks fine.

+        const uint16_t* strChars = reinterpret_cast<const uint16_t*>(str->characters());
+        const uint16_t* bufChars = reinterpret_cast<const uint16_t*>(buf.s);

These should just be const UChar* instead of const uint16_t*, and then you won't need reinterpret_cast.

I think in the future we might want to reverse the if and enable the fast 32-bit-at-a-time path only on certain platforms. The ARM case can be the normal case.

In fact, I don't think the ARM side needs a comment. Instead the comment should be on the 32-bit side. Something like:

    // On platforms where it's safe to do so, it's faster to go 4 bytes at a time.

review- because of the unneeded reinterpret_cast.
Comment 6 Krzysztof Kowalczyk 2007-03-01 15:58:07 PST
Created attachment 13443 [details]
Updated per review comments

Removed unnecessary reinterpret_cast, move comment from ARM part to the other part.
Comment 7 Darin Adler 2007-03-01 17:28:16 PST
Comment on attachment 13443 [details]
Updated per review comments

r=me
Comment 8 Krzysztof Kowalczyk 2007-03-01 18:11:08 PST
landed in r19930