Bug 209847 - [Win][WK2] Connection::platformInvalidate should cancel async ReadFile operation by using CancelIo API
Status: REOPENED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit2
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-03-31 21:45 PDT by Fujii Hironori
Modified: 2020-04-30 21:14 PDT
See Also:


Attachments
CrashLog_0a48_2020-03-31_15-40-57-287.txt (21.12 KB, text/plain)
2020-03-31 22:07 PDT, Fujii Hironori
CrashLog_13dc_2020-03-31_16-32-33-206.txt (19.01 KB, text/plain)
2020-03-31 22:07 PDT, Fujii Hironori
CrashLog_1518_2020-03-31_16-30-30-093.txt (19.08 KB, text/plain)
2020-03-31 22:07 PDT, Fujii Hironori
CrashLog_1eec_2020-03-31_23-48-43-898.txt (22.45 KB, text/plain)
2020-04-01 01:16 PDT, Fujii Hironori
CrashLog_19f8_2020-04-01_00-25-00-638.txt (23.16 KB, text/plain)
2020-04-01 01:16 PDT, Fujii Hironori
CrashLog_1524_2020-04-01_00-30-06-364.txt (22.48 KB, text/plain)
2020-04-01 01:16 PDT, Fujii Hironori
navigator-language-fr-crash-log.txt (37.63 KB, text/plain)
2020-04-07 01:21 PDT, Fujii Hironori
idbdatabase-deleteobjectstore-failures-crash-log.txt (35.99 KB, text/plain)
2020-04-07 01:21 PDT, Fujii Hironori
[Patch] Adding HeapValidate (687 bytes, patch)
2020-04-09 14:45 PDT, Fujii Hironori
[Patch] Remove pendingCallbacks of WebsiteDataStore::removeData's CallbackAggregator (1.42 KB, patch)
2020-04-10 15:46 PDT, Fujii Hironori
debugging patch to do _heapchk in fastFree (390 bytes, patch)
2020-04-12 18:59 PDT, Fujii Hironori
CrashLog of attachment 396247 (102.72 KB, text/plain)
2020-04-12 19:04 PDT, Fujii Hironori
Patch to use mimalloc (1.17 KB, patch)
2020-04-15 23:18 PDT, Fujii Hironori
crash logs.zip (576.04 KB, application/x-zip-compressed)
2020-04-16 13:06 PDT, Fujii Hironori
crash logs.zip (85.00 KB, application/x-zip-compressed)
2020-04-16 14:12 PDT, Fujii Hironori
Patch to replace m_readBuffer's VectorMalloc with mimalloc (2.58 KB, patch)
2020-04-19 14:46 PDT, Fujii Hironori
Patch to leak the vector buffer in Connection::platformInvalidate to avoid heap corruption (549 bytes, patch)
2020-04-19 22:12 PDT, Fujii Hironori
WIP Patch (500 bytes, patch)
2020-04-22 21:04 PDT, Fujii Hironori

Description Fujii Hironori 2020-03-31 21:45:44 PDT
[WinCairo][WK2] random crashes by heap corruption

Pavel Feldman reported:

> Unrelatedly, I was trying to build clang/ASan, but failed. We are
> seeing intermittent heap corruption in Win port running under
> HyperV. Running it with app verifier /full mode hides that issue,
> i.e. corruption never occurs. I know that VC asan for 64 bit is
> in the works, but I wonder if anyone was building asan.
> It never happens in normal setup, no matter how much I stress the
> system, only fails in HyperV. So it might have to do with limited
> cores available...
Comment 1 Fujii Hironori 2020-03-31 22:07:19 PDT
Created attachment 395129 [details]
CrashLog_0a48_2020-03-31_15-40-57-287.txt
Comment 2 Fujii Hironori 2020-03-31 22:07:38 PDT
Created attachment 395130 [details]
CrashLog_13dc_2020-03-31_16-32-33-206.txt
Comment 3 Fujii Hironori 2020-03-31 22:07:53 PDT
Created attachment 395131 [details]
CrashLog_1518_2020-03-31_16-30-30-093.txt
Comment 4 Fujii Hironori 2020-03-31 22:11:33 PDT
Backtraces from the logs:

CrashLog_0a48_2020-03-31_15-40-57-287.txt

.  0  Id: 4f8.1a48 Suspend: 1 Teb: 00000093`a6d5c000 Unfrozen
 # Child-SP          RetAddr           Call Site
00 00000093`a6efa370 00007ffc`fe479273 ntdll!RtlIsNonEmptyDirectoryReparsePointAllowed+0xe9
01 00000093`a6efa3c0 00007ffc`fe481662 ntdll!RtlIsNonEmptyDirectoryReparsePointAllowed+0xb3
02 00000093`a6efa4b0 00007ffc`fe48196a ntdll!RtlpNtMakeTemporaryKey+0x492
03 00000093`a6efa4e0 00007ffc`fe48a929 ntdll!RtlpNtMakeTemporaryKey+0x79a
04 00000093`a6efa510 00007ffc`fe3be511 ntdll!RtlpNtMakeTemporaryKey+0x9759
05 00000093`a6efa540 00007ffc`fe3bbabb ntdll!RtlAllocateHeap+0x2ca1
06 00000093`a6efa720 00007ffc`fe3dc02c ntdll!RtlAllocateHeap+0x24b
07 00000093`a6efa830 00007ffc`fe3dbd97 ntdll!RtlCreateProcessParametersWithTemplate+0x39c
08 00000093`a6efa8e0 00007ffc`fb66d078 ntdll!RtlCreateProcessParametersWithTemplate+0x107
09 00000093`a6efa960 00007ffc`fb66f1dd KERNELBASE!IsProcessInJob+0x258
0a 00000093`a6efae30 00007ffc`fb66bdc6 KERNELBASE!CreateProcessInternalW+0x1d2d
0b 00000093`a6efbf80 00007ffc`fdbdbe93 KERNELBASE!CreateProcessW+0x66
0c 00000093`a6efbff0 00007ffc`dadecc79 KERNEL32!CreateProcessW+0x53
0d 00000093`a6efc050 00007ffc`db096fcb WebKit2!WebKit::ProcessLauncher::launchProcess(void)+0x499 [C:\jenkins_slave\WinCairo-master\Source\WebKit\UIProcess\Launcher\win\ProcessLauncherWin.cpp @ 105]
0e 00000093`a6efc3d0 00007ffc`dafff52d WebKit2!WebKit::ProcessLauncher::ProcessLauncher(class WebKit::ProcessLauncher::Client * client = <Value unavailable error>, struct WebKit::ProcessLauncher::LaunchOptions * launchOptions = <Value unavailable error>)+0x6b [C:\jenkins_slave\WinCairo-master\Source\WebKit\UIProcess\Launcher\ProcessLauncher.cpp @ 41]
0f (Inline Function) --------`-------- WebKit2!WebKit::ProcessLauncher::create+0x1c [C:\jenkins_slave\WinCairo-master\Source\WebKit\UIProcess\Launcher\ProcessLauncher.h @ 92]
10 00000093`a6efc400 00007ffc`db09ae62 WebKit2!WebKit::AuxiliaryProcessProxy::connect(void)+0x5d [C:\jenkins_slave\WinCairo-master\Source\WebKit\UIProcess\AuxiliaryProcessProxy.cpp @ 97]
11 00000093`a6efc480 00007ffc`db04e45b WebKit2!WebKit::NetworkProcessProxy::NetworkProcessProxy(class WebKit::WebProcessPool * processPool = 0x00000282`4c739ab0)+0xb2 [C:\jenkins_slave\WinCairo-master\Source\WebKit\UIProcess\Network\NetworkProcessProxy.cpp @ 95]
12 (Inline Function) --------`-------- WebKit2!std::make_unique+0x19 [C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.24.28314\include\memory @ 2055]
13 (Inline Function) --------`-------- WebKit2!WTF::makeUnique+0x19 [C:\jenkins_slave\WinCairo-master\WebKitBuild\Release\WTF\Headers\wtf\StdLibExtras.h @ 483]
14 00000093`a6efc500 00007ffc`db0b93db WebKit2!WebKit::WebProcessPool::ensureNetworkProcess(class WebKit::WebsiteDataStore * withWebsiteDataStore = 0x00000282`49f9fed0)+0x13b [C:\jenkins_slave\WinCairo-master\Source\WebKit\UIProcess\WebProcessPool.cpp @ 505]
15 00000093`a6efc960 00007ffc`db08c6bd WebKit2!WebKit::WebsiteDataStore::removeData(class WTF::OptionSet<WebKit::WebsiteDataType> dataTypes = <Value unavailable error>, class WTF::WallTime modifiedSince = class WTF::WallTime, class WTF::Function<void ()> * completionHandler = <Value unavailable error>)+0x28b [C:\jenkins_slave\WinCairo-master\Source\WebKit\UIProcess\WebsiteData\WebsiteDataStore.cpp @ 726]
16 00000093`a6efca30 00007ffc`e462681c WebKit2!WKWebsiteDataStoreRemoveAllServiceWorkerRegistrations(struct OpaqueWKWebsiteDataStore * dataStoreRef = <Value unavailable error>, void * context = 0x00000093`a6efcae8, <function> * callback = 0x00007ffc`e4631a30)+0x5d [C:\jenkins_slave\WinCairo-master\Source\WebKit\UIProcess\API\C\WKWebsiteDataStoreRef.cpp @ 662]
17 (Inline Function) --------`-------- WebKitTestRunnerLib!WTR::TestController::clearServiceWorkerRegistrations+0x29 [C:\jenkins_slave\WinCairo-master\Tools\WebKitTestRunner\TestController.cpp @ 3172]
18 00000093`a6efca80 00007ffc`e46261c9 WebKitTestRunnerLib!WTR::TestController::resetStateToConsistentValues(struct WTR::TestOptions * options = 0x00000093`a6efcb88, WTR::TestController::ResetStage resetStage = BeforeTest (0n0))+0x2dc [C:\jenkins_slave\WinCairo-master\Tools\WebKitTestRunner\TestController.cpp @ 1033]
19 00000093`a6efcb50 00007ffc`e462b421 WebKitTestRunnerLib!WTR::TestController::ensureViewSupportsOptionsForTest(class WTR::TestInvocation * test = <Value unavailable error>)+0x189 [C:\jenkins_slave\WinCairo-master\Tools\WebKitTestRunner\TestController.cpp @ 822]
1a 00000093`a6efcc80 00007ffc`e4640ccd WebKitTestRunnerLib!WTR::TestController::configureViewForTest(class WTR::TestInvocation * test = 0x00000282`4c763510)+0x11 [C:\jenkins_slave\WinCairo-master\Tools\WebKitTestRunner\TestController.cpp @ 1561]
1b 00000093`a6efccc0 00007ffc`e462c22f WebKitTestRunnerLib!WTR::TestInvocation::invoke(void)+0x2d [C:\jenkins_slave\WinCairo-master\Tools\WebKitTestRunner\TestInvocation.cpp @ 161]
1c 00000093`a6efcd20 00007ffc`e462c72c WebKitTestRunnerLib!WTR::TestController::runTest(char * inputLine = <Value unavailable error>)+0xb6f [C:\jenkins_slave\WinCairo-master\Tools\WebKitTestRunner\TestController.cpp @ 1781]
1d 00000093`a6efcff0 00007ffc`e46220ea WebKitTestRunnerLib!WTR::TestController::runTestingServerLoop(void)+0x9c [C:\jenkins_slave\WinCairo-master\Tools\WebKitTestRunner\TestController.cpp @ 1815]
1e (Inline Function) --------`-------- WebKitTestRunnerLib!WTR::TestController::run+0xe [C:\jenkins_slave\WinCairo-master\Tools\WebKitTestRunner\TestController.cpp @ 1834]
1f 00000093`a6efd840 00007ffc`e465b40c WebKitTestRunnerLib!WTR::TestController::TestController(int argc = <Value unavailable error>, char ** argv = <Value unavailable error>)+0x12a [C:\jenkins_slave\WinCairo-master\Tools\WebKitTestRunner\TestController.cpp @ 167]
20 00000093`a6efd890 00007ff7`8cf31734 WebKitTestRunnerLib!dllLauncherEntryPoint(int argc = <Value unavailable error>, char ** argv = <Value unavailable error>)+0x2c [C:\jenkins_slave\WinCairo-master\Tools\WebKitTestRunner\win\main.cpp @ 34]
21 00000093`a6efda50 00007ff7`8cf3321c WebKitTestRunner!main(int argc = 0n2, char ** argv = 0x00000282`49f96ac0)+0x734 [C:\jenkins_slave\WinCairo-master\Tools\win\DLLLauncher\DLLLauncherMain.cpp @ 218]
22 (Inline Function) --------`-------- WebKitTestRunner!invoke_main+0x22 [d:\agent\_work\5\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 78]
23 00000093`a6effdb0 00007ffc`fdbd7bd4 WebKitTestRunner!__scrt_common_main_seh(void)+0x10c [d:\agent\_work\5\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 288]
24 00000093`a6effdf0 00007ffc`fe3eced1 KERNEL32!BaseThreadInitThunk+0x14
25 00000093`a6effe20 00000000`00000000 ntdll!RtlUserThreadStart+0x21

CrashLog_13dc_2020-03-31_16-32-33-206.txt

#  2  Id: 1810.102c Suspend: 1 Teb: 000000ce`9fba2000 Unfrozen
 # Child-SP          RetAddr           Call Site
00 000000ce`a00ff220 00007ffc`fe479273 ntdll!RtlIsNonEmptyDirectoryReparsePointAllowed+0xe9
01 000000ce`a00ff270 00007ffc`fe481662 ntdll!RtlIsNonEmptyDirectoryReparsePointAllowed+0xb3
02 000000ce`a00ff360 00007ffc`fe48196a ntdll!RtlpNtMakeTemporaryKey+0x492
03 000000ce`a00ff390 00007ffc`fe48a929 ntdll!RtlpNtMakeTemporaryKey+0x79a
04 000000ce`a00ff3c0 00007ffc`fe3be511 ntdll!RtlpNtMakeTemporaryKey+0x9759
05 000000ce`a00ff3f0 00007ffc`fe3bbabb ntdll!RtlAllocateHeap+0x2ca1
06 000000ce`a00ff5d0 00007ffc`fb982596 ntdll!RtlAllocateHeap+0x24b
07 000000ce`a00ff6e0 00007ffc`e44d9b7a ucrtbase!malloc_base+0x36
08 000000ce`a00ff710 00007ffc`dafc4a5f WTF!WTF::fastMalloc(unsigned int64 n = <Value unavailable error>)+0xa [C:\jenkins_slave\WinCairo-master\Source\WTF\wtf\FastMalloc.cpp @ 202]
09 (Inline Function) --------`-------- WebKit2!IPC::copyBuffer+0x9 [C:\jenkins_slave\WinCairo-master\Source\WebKit\Platform\IPC\Decoder.cpp @ 41]
0a 000000ce`a00ff740 00007ffc`dade6941 WebKit2!IPC::Decoder::Decoder(unsigned char * buffer = 0x000001e2`cdf91f00 "", unsigned int64 bufferSize = 0x4e4, <function> * bufferDeallocator = 0x00000000`00000000, class WTF::Vector<IPC::Attachment,0,WTF::CrashOnOverflow,16,WTF::FastMalloc> * attachments = 0x000000ce`a00ff7e0)+0x2f [C:\jenkins_slave\WinCairo-master\Source\WebKit\Platform\IPC\Decoder.cpp @ 53]
0b (Inline Function) --------`-------- WebKit2!std::make_unique+0x24 [C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.24.28314\include\memory @ 2055]
0c (Inline Function) --------`-------- WebKit2!WTF::makeUnique+0x24 [C:\jenkins_slave\WinCairo-master\WebKitBuild\Release\WTF\Headers\wtf\StdLibExtras.h @ 483]
0d 000000ce`a00ff7a0 00007ffc`e455242a WebKit2!IPC::Connection::readEventHandler(void)+0xe1 [C:\jenkins_slave\WinCairo-master\Source\WebKit\Platform\IPC\win\ConnectionWin.cpp @ 144]
0e (Inline Function) --------`-------- WTF!WTF::Function<void +0x1e [C:\jenkins_slave\WinCairo-master\Source\WTF\wtf\Function.h @ 84]
0f 000000ce`a00ff860 00007ffc`e455236e WTF!WTF::WorkQueue::performWorkOnRegisteredWorkThread(void)+0x8a [C:\jenkins_slave\WinCairo-master\Source\WTF\wtf\win\WorkQueueWin.cpp @ 61]
10 000000ce`a00ff8e0 00007ffc`fe3af6d5 WTF!WTF::WorkQueue::workThreadCallback(void * context = 0x000001e2`ce95d650)+0x1e [C:\jenkins_slave\WinCairo-master\Source\WTF\wtf\win\WorkQueueWin.cpp @ 44]
11 000000ce`a00ff910 00007ffc`fe3b4634 ntdll!LdrUnloadDll+0x325
12 000000ce`a00ff9f0 00007ffc`fdbd7bd4 ntdll!RtlInitializeResource+0xce4
13 000000ce`a00ffdb0 00007ffc`fe3eced1 KERNEL32!BaseThreadInitThunk+0x14
14 000000ce`a00ffde0 00000000`00000000 ntdll!RtlUserThreadStart+0x21

CrashLog_1518_2020-03-31_16-30-30-093.txt

#  2  Id: 15fc.1b50 Suspend: 1 Teb: 0000006c`30e3b000 Unfrozen
 # Child-SP          RetAddr           Call Site
00 0000006c`313ff0a0 00007ffc`fe479273 ntdll!RtlIsNonEmptyDirectoryReparsePointAllowed+0xe9
01 0000006c`313ff0f0 00007ffc`fe481662 ntdll!RtlIsNonEmptyDirectoryReparsePointAllowed+0xb3
02 0000006c`313ff1e0 00007ffc`fe48196a ntdll!RtlpNtMakeTemporaryKey+0x492
03 0000006c`313ff210 00007ffc`fe48a929 ntdll!RtlpNtMakeTemporaryKey+0x79a
04 0000006c`313ff240 00007ffc`fe3be511 ntdll!RtlpNtMakeTemporaryKey+0x9759
05 0000006c`313ff270 00007ffc`fe3bbabb ntdll!RtlAllocateHeap+0x2ca1
06 0000006c`313ff450 00007ffc`fb982596 ntdll!RtlAllocateHeap+0x24b
07 0000006c`313ff560 00007ffc`e44d9b7a ucrtbase!malloc_base+0x36
08 0000006c`313ff590 00007ffc`dade6be4 WTF!WTF::fastMalloc(unsigned int64 n = <Value unavailable error>)+0xa [C:\jenkins_slave\WinCairo-master\Source\WTF\wtf\FastMalloc.cpp @ 202]
09 (Inline Function) --------`-------- WebKit2!WTF::FastMalloc::malloc+0x6 [C:\jenkins_slave\WinCairo-master\WebKitBuild\Release\WTF\Headers\wtf\FastMalloc.h @ 195]
0a (Inline Function) --------`-------- WebKit2!WTF::VectorBufferBase<unsigned char,WTF::FastMalloc>::allocateBuffer+0x20 [C:\jenkins_slave\WinCairo-master\WebKitBuild\Release\WTF\Headers\wtf\Vector.h @ 292]
0b (Inline Function) --------`-------- WebKit2!WTF::Vector<unsigned char,0,WTF::CrashOnOverflow,16,WTF::FastMalloc>::reserveCapacity+0x25 [C:\jenkins_slave\WinCairo-master\WebKitBuild\Release\WTF\Headers\wtf\Vector.h @ 1189]
0c (Inline Function) --------`-------- WebKit2!WTF::Vector<unsigned char,0,WTF::CrashOnOverflow,16,WTF::FastMalloc>::expandCapacity+0x174 [C:\jenkins_slave\WinCairo-master\WebKitBuild\Release\WTF\Headers\wtf\Vector.h @ 1047]
0d (Inline Function) --------`-------- WebKit2!WTF::Vector<unsigned char,0,WTF::CrashOnOverflow,16,WTF::FastMalloc>::grow+0x18c [C:\jenkins_slave\WinCairo-master\WebKitBuild\Release\WTF\Headers\wtf\Vector.h @ 1128]
0e 0000006c`313ff5c0 00007ffc`e455242a WebKit2!IPC::Connection::readEventHandler(void)+0x384 [C:\jenkins_slave\WinCairo-master\Source\WebKit\Platform\IPC\win\ConnectionWin.cpp @ 124]
0f (Inline Function) --------`-------- WTF!WTF::Function<void +0x1e [C:\jenkins_slave\WinCairo-master\Source\WTF\wtf\Function.h @ 84]
10 0000006c`313ff680 00007ffc`e455236e WTF!WTF::WorkQueue::performWorkOnRegisteredWorkThread(void)+0x8a [C:\jenkins_slave\WinCairo-master\Source\WTF\wtf\win\WorkQueueWin.cpp @ 61]
11 0000006c`313ff700 00007ffc`fe3af6d5 WTF!WTF::WorkQueue::workThreadCallback(void * context = 0x00000236`4ab6e390)+0x1e [C:\jenkins_slave\WinCairo-master\Source\WTF\wtf\win\WorkQueueWin.cpp @ 44]
12 0000006c`313ff730 00007ffc`fe3b4634 ntdll!LdrUnloadDll+0x325
13 0000006c`313ff810 00007ffc`fdbd7bd4 ntdll!RtlInitializeResource+0xce4
14 0000006c`313ffbd0 00007ffc`fe3eced1 KERNEL32!BaseThreadInitThunk+0x14
15 0000006c`313ffc00 00000000`00000000 ntdll!RtlUserThreadStart+0x21
Comment 5 Fujii Hironori 2020-03-31 22:13:16 PDT
All crashes happened in the UI process, and their ExceptionCode was c0000374.
I don't observe a similar crash in the Jenkins WinCairo WK1 layout testing job.
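
For readers triaging the logs: c0000374 is the NTSTATUS value STATUS_HEAP_CORRUPTION. Below is a minimal sketch of a lookup helper for tagging crash logs. The function name describeExceptionCode is hypothetical and not part of WebKit, and the table lists only a few well-known codes.

```cpp
#include <cstdint>
#include <string>

// Hypothetical triage helper (not a WebKit or Windows API): maps a few
// well-known NTSTATUS exception codes to their symbolic names.
inline std::string describeExceptionCode(std::uint32_t code)
{
    switch (code) {
    case 0xC0000374u:
        return "STATUS_HEAP_CORRUPTION"; // the code seen in these crash logs
    case 0xC0000005u:
        return "STATUS_ACCESS_VIOLATION";
    case 0xC00000FDu:
        return "STATUS_STACK_OVERFLOW";
    default:
        return "unknown";
    }
}
```

The heap manager raises STATUS_HEAP_CORRUPTION when it notices inconsistent metadata, so the faulting stack usually shows where the damage was detected rather than the code that caused it.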
Comment 6 Fujii Hironori 2020-03-31 22:19:25 PDT
I'm observing this issue only in internal Jenkins jobs to run LayoutTest of WinCairo WK2.
I can't reproduce the crash on my PC by downloading the build artifact from the Jenkins job.
Comment 7 Fujii Hironori 2020-04-01 01:16:07 PDT
Created attachment 395142 [details]
CrashLog_1eec_2020-03-31_23-48-43-898.txt

The internal Jenkins job to run WinCairo WK2 Debug LayoutTests is also showing the c0000374 crashes.
Comment 8 Fujii Hironori 2020-04-01 01:16:20 PDT
Created attachment 395143 [details]
CrashLog_19f8_2020-04-01_00-25-00-638.txt
Comment 9 Fujii Hironori 2020-04-01 01:16:32 PDT
Created attachment 395144 [details]
CrashLog_1524_2020-04-01_00-30-06-364.txt
Comment 10 Fujii Hironori 2020-04-07 01:21:20 PDT
Created attachment 395660 [details]
navigator-language-fr-crash-log.txt

I reproduced the c0000374 crash on my PC by limiting the number of CPUs to 2 with msconfig.exe.
I used WinCairo build artifact downloaded from internal Jenkins.

The crash logs were generated by the following tests.
fast/text/international/system-language/navigator-language/navigator-language-fr.html
storage/indexeddb/modern/idbdatabase-deleteobjectstore-failures.html
Comment 11 Fujii Hironori 2020-04-07 01:21:34 PDT
Created attachment 395661 [details]
idbdatabase-deleteobjectstore-failures-crash-log.txt
Comment 12 Fujii Hironori 2020-04-08 17:23:01 PDT
Steps to reproduce:

1. Limit the number of CPUs to 2 with msconfig.exe
2. Reboot Windows
3. Repeat python ./Tools/Scripts/run-webkit-tests --release --no-new-test-results --no-retry-failures --wincairo -f fast/text/international/system-language/navigator-language
Comment 13 Fujii Hironori 2020-04-08 18:42:52 PDT
I tested several WinCairo WebKitTestRunner revisions. The crash seems to have been present since the beginning of WinCairo WK2.

r246418: Crashed
r248403: Crashed
r249458: Crashed
r256008: Crashed
r258670: Crashed
r259306: Crashed
Comment 14 Fujii Hironori 2020-04-08 20:32:45 PDT
The crash can be reproduced without changing the boot parameter.

1. Start PowerShell
2. Open Task Manager
3. Right click at powershell.exe
4. Select "Set Affinity"
5. Check CPU 2 and CPU 3
6. python ./Tools/Scripts/run-webkit-tests --release --no-new-test-results --no-retry-failures --wincairo -f --iterations=10 --child-processes=2 fast/text/international/system-language/navigator-language

However, the reproduction rate is lower than Comment 12.
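
The Task Manager steps above could also be done programmatically. The following is only a sketch under assumptions: the helper names affinityMaskFor and pinCurrentProcessTo are hypothetical, and it relies on the documented behavior that child processes inherit the parent's affinity mask. CPU 2 and CPU 3 correspond to bits 2 and 3, i.e. mask 0xC.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>
#ifdef _WIN32
#include <windows.h>
#endif

// Builds a bitmask with one bit set per logical CPU index.
inline std::uint64_t affinityMaskFor(std::initializer_list<unsigned> cpus)
{
    std::uint64_t mask = 0;
    for (unsigned cpu : cpus) {
        assert(cpu < 64); // a single processor group holds at most 64 CPUs
        mask |= std::uint64_t { 1 } << cpu;
    }
    return mask;
}

// Restricts the current process to the given CPUs and returns true on success.
// On non-Windows builds this is a stub that does nothing and returns false.
inline bool pinCurrentProcessTo(std::initializer_list<unsigned> cpus)
{
#ifdef _WIN32
    auto mask = static_cast<DWORD_PTR>(affinityMaskFor(cpus));
    return SetProcessAffinityMask(GetCurrentProcess(), mask) != 0;
#else
    (void)cpus;
    return false;
#endif
}
```

A launcher that calls pinCurrentProcessTo({ 2, 3 }) before spawning the test run would play the same role as setting the affinity of powershell.exe by hand.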
Comment 15 Fujii Hironori 2020-04-09 14:45:20 PDT
Created attachment 396013 [details]
[Patch] Adding HeapValidate

I added HeapValidate(GetProcessHeap(), 0, nullptr) in TestController::clearIndexedDatabases before and after WKWebsiteDataStoreRemoveAllIndexedDatabases.
The first HeapValidate doesn't crash, but the second crashes.
So, WKWebsiteDataStoreRemoveAllIndexedDatabases seems the culprit.
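
The bracketing technique can be sketched as follows. This is not the attached patch: the names HeapCheckResult, processHeapLooksValid and bracketWithHeapCheck are hypothetical, and only the HeapValidate(GetProcessHeap(), 0, nullptr) call comes from the description above. The validator is injectable so the logic can be exercised without a corrupted heap.

```cpp
#include <functional>
#ifdef _WIN32
#include <windows.h>
#endif

enum class HeapCheckResult { Clean, CorruptBefore, CorruptAfter };

// Returns true if the process heap passes validation.
// On non-Windows builds this sketch has no equivalent check and returns true.
inline bool processHeapLooksValid()
{
#ifdef _WIN32
    return HeapValidate(GetProcessHeap(), 0, nullptr) != 0;
#else
    return true;
#endif
}

// Validates the heap, runs the suspect call, then validates again, and
// reports at which checkpoint (if any) the heap first failed validation.
inline HeapCheckResult bracketWithHeapCheck(
    const std::function<void()>& suspect,
    const std::function<bool()>& heapIsValid = processHeapLooksValid)
{
    if (!heapIsValid())
        return HeapCheckResult::CorruptBefore;
    suspect();
    if (!heapIsValid())
        return HeapCheckResult::CorruptAfter;
    return HeapCheckResult::Clean;
}
```

Two caveats apply. A failing check may terminate the process inside the validator instead of returning false. Also, CorruptAfter only says the damage became visible inside the window; in a multi-threaded process another thread may have written to the heap while the suspect call was running.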

Surprisingly, skipping the WKWebsiteDataStoreRemoveAllIndexedDatabases call doesn't eliminate the heap corruption crashes.
With that call removed, most crashes happen in WebKit::BackingStore::incorporateUpdate.
https://gist.github.com/fujii/78afed5686d6e48d8b42a2fdf9e6295e

My current hypotheses are:

1. There are several threading issues in WKWebsiteDataStoreRemoveAllIndexedDatabases and WebKit::BackingStore::incorporateUpdate and more.
2. There is a threading issue in thread primitive or fundamental part.
Comment 16 Fujii Hironori 2020-04-09 14:53:39 PDT
(In reply to Fujii Hironori from comment #15)
> 2. There is a threading issue in thread primitive or fundamental part.

This is unlikely. If so, Web process also should crash.
Comment 17 Fujii Hironori 2020-04-10 15:46:56 PDT
Created attachment 396132 [details]
[Patch] Remove pendingCallbacks of WebsiteDataStore::removeData's CallbackAggregator

This patch seems to fix the heap corruption in WebsiteDataStore::removeData, but I don't know why.
Comment 18 Fujii Hironori 2020-04-12 17:51:34 PDT
(In reply to Fujii Hironori from comment #15)
> Created attachment 396013 [details]
> [Patch] Adding HeapValidate
> 
> I added HeapValidate(GetProcessHeap(), 0, nullptr) in
> TestController::clearIndexedDatabases before and after
> WKWebsiteDataStoreRemoveAllIndexedDatabases.
> The first HeapValidate doesn't crash, but the second crashes.
> So, WKWebsiteDataStoreRemoveAllIndexedDatabases seems the culprit.
> 
> Surprisingly, skipping the WKWebsiteDataStoreRemoveAllIndexedDatabases call
> doesn't eliminate the heap corruption crashes.
> With that call removed, most crashes happen in
> WebKit::BackingStore::incorporateUpdate.
> https://gist.github.com/fujii/78afed5686d6e48d8b42a2fdf9e6295e
> 
> My current hypotheses are:
> 
> 1. There are several threading issues in
> WKWebsiteDataStoreRemoveAllIndexedDatabases and
> WebKit::BackingStore::incorporateUpdate and more.
> 2. There is a threading issue in thread primitive or fundamental part.

It turned out that these hypotheses were wrong.
The crashes happen while running the run loop.
Comment 19 Fujii Hironori 2020-04-12 18:59:43 PDT
Created attachment 396247 [details]
debugging patch to do _heapchk in fastFree

I added _heapchk in fastFree.
Comment 20 Fujii Hironori 2020-04-12 19:04:05 PDT
Created attachment 396248 [details]
CrashLog of attachment 396247 [details]

Then, I got the following backtrace.

> #  3  Id: 165c0.16ab4 Suspend: 1 Teb: 00000028`609d8000 Unfrozen
>  # Child-SP          RetAddr           Call Site
> 00 00000028`60fff1c0 00007ffb`a3599273 ntdll!RtlReportFatalFailure+0x9
> 01 00000028`60fff210 00007ffb`a35a1662 ntdll!RtlReportCriticalFailure+0x97
> 02 00000028`60fff300 00007ffb`a35a196a ntdll!RtlpHeapHandleError+0x12
> 03 00000028`60fff330 00007ffb`a35aa929 ntdll!RtlpHpHeapHandleError+0x7a
> 04 00000028`60fff360 00007ffb`a35a1571 ntdll!RtlpLogHeapFailure+0x45
> 05 00000028`60fff390 00007ffb`a35a6493 ntdll!RtlpAnalyzeHeapFailure+0x2fd
> 06 00000028`60fff3f0 00007ffb`a350fe12 ntdll!RtlpValidateHeap+0x8b
> 07 00000028`60fff480 00007ffb`a04f5f7b ntdll!RtlValidateHeap+0xc2
> 08 00000028`60fff4d0 00007ffb`a14fb716 KERNELBASE!HeapValidate+0xb
> 09 00000028`60fff500 00007ffb`74e49b50 ucrtbase!heapchk+0x16
> 0a 00000028`60fff530 00007ffb`5f998ad0 WTF!WTF::fastFree(void * p = <Value unavailable error>)+0x10 [S:\gc\Source\WTF\wtf\FastMalloc.cpp @ 227]
> 0b (Inline Function) --------`-------- WebKit2!WTF::Detail::CallableWrapperBase<void>::operator delete+0x9 [S:\gc\WebKitBuild\Release\WTF\Headers\wtf\Function.h @ 37]
> 0c 00000028`60fff560 00007ffb`74ec162c WebKit2!WTF::Detail::CallableWrapper<`lambda at ..\..\Source\WebKit\Platform\IPC\win\ConnectionWin.cpp:240:33',void>::~CallableWrapper(int should_call_delete = 0n1)+0x30 [S:\gc\WebKitBuild\Release\WTF\Headers\wtf\Function.h @ 46]
> 0d (Inline Function) --------`-------- WTF!std::default_delete<WTF::Detail::CallableWrapperBase<void> >::operator()+0xa [C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.25.28610\include\memory @ 1758]
> 0e (Inline Function) --------`-------- WTF!std::unique_ptr<WTF::Detail::CallableWrapperBase<void>,std::default_delete<WTF::Detail::CallableWrapperBase<void> > >::~unique_ptr+0x13 [C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.25.28610\include\memory @ 1873]
> 0f (Inline Function) --------`-------- WTF!WTF::Function<void +0x13 [S:\gc\Source\WTF\wtf\Function.h @ 59]
> 10 (Inline Function) --------`-------- WTF!WTF::VectorDestructor<1,WTF::Function<void +0x1c [S:\gc\Source\WTF\wtf\Vector.h @ 66]
> 11 (Inline Function) --------`-------- WTF!WTF::VectorTypeOperations<WTF::Function<void +0x1c [S:\gc\Source\WTF\wtf\Vector.h @ 242]
> 12 (Inline Function) --------`-------- WTF!WTF::Vector<WTF::Function<void +0x1c [S:\gc\Source\WTF\wtf\Vector.h @ 677]
> 13 00000028`60fff5a0 00007ffb`74ec152e WTF!WTF::WorkQueue::performWorkOnRegisteredWorkThread(void)+0xcc [S:\gc\Source\WTF\wtf\win\WorkQueueWin.cpp @ 64]
> 14 00000028`60fff620 00007ffb`a34cf6d5 WTF!WTF::WorkQueue::workThreadCallback(void * context = 0x00000158`fd209a00)+0x1e [S:\gc\Source\WTF\wtf\win\WorkQueueWin.cpp @ 44]
> 15 00000028`60fff650 00007ffb`a34d4634 ntdll!RtlpTpWorkCallback+0x165
> 16 00000028`60fff730 00007ffb`a1f67bd4 ntdll!TppWorkerThread+0x8d4
> 17 00000028`60fffaf0 00007ffb`a350ced1 KERNEL32!BaseThreadInitThunk+0x14
> 18 00000028`60fffb20 00000000`00000000 ntdll!RtlUserThreadStart+0x21
Comment 21 Fujii Hironori 2020-04-15 23:18:18 PDT
Created attachment 396625 [details]
Patch to use mimalloc

https://github.com/microsoft/mimalloc

1. Build mimalloc
2. Copy mimalloc-static.lib to WebKitLibraries/win/lib64
3. Copy mimalloc-override.h and mimalloc.h to WebKitLibraries/win/include
4. Apply the patch
5. Build WinCairo

I observe no heap corruption crashes by using mimalloc.
This seems to be the only reasonable workaround at the moment.
Comment 22 Fujii Hironori 2020-04-16 13:06:19 PDT
Created attachment 396691 [details]
crash logs.zip

No, that was wrong. I got 70 crashes with the mimalloc patch (Comment 21) and the 2-CPU affinity setting (Comment 14).

python ./Tools/Scripts/run-webkit-tests --release --no-new-test-results --no-retry-failures --wincairo -f --iterations=1000 --child-processes=4 fast/text/international/system-language/navigator-language

The crashes happened around IPC. For example,

> #  2  Id: 31d94.2c86c Suspend: 1 Teb: 00000032`911b9000 Unfrozen
>  # Child-SP          RetAddr           Call Site
> 00 (Inline Function) --------`-------- WTF!_mi_heap_delayed_free+0x2d [C:\work\mimalloc\src\page.c @ 284]
> 01 00000032`917ff2e0 00007ffc`6b779149 WTF!_mi_malloc_generic(struct mi_heap_s * heap = 0x000001c4`eff90000, unsigned int64 size = 0x80)+0xc0 [C:\work\mimalloc\src\page.c @ 793]
> 02 00000032`917ff330 00007ffc`6b7e3c1e WTF!WTF::fastMalloc(unsigned int64 n = <Value unavailable error>)+0x9 [S:\gb\Source\WTF\wtf\FastMalloc.cpp @ 202]
> 03 (Inline Function) --------`-------- WTF!WTF::FastMalloc::malloc+0x5 [S:\gb\Source\WTF\wtf\FastMalloc.h @ 197]
> 04 (Inline Function) --------`-------- WTF!WTF::VectorBufferBase<WTF::Function<void __cdecl+0x33 [S:\gb\Source\WTF\wtf\Vector.h @ 292]
> 05 (Inline Function) --------`-------- WTF!WTF::Vector<WTF::Function<void __cdecl+0x50 [S:\gb\Source\WTF\wtf\Vector.h @ 1189]
> 06 00000032`917ff360 00007ffc`6b7e3b81 WTF!WTF::Vector<WTF::Function<void __cdecl(unsigned int64 newMinCapacity = <Value unavailable error>)+0x7e [S:\gb\Source\WTF\wtf\Vector.h @ 1047]
> 07 00000032`917ff3b0 00007ffc`6b7e3953 WTF!WTF::Vector<WTF::Function<void __cdecl(unsigned int64 newMinCapacity = <Value unavailable error>, class WTF::Function<void __cdecl(void)> * ptr = 0x00000032`917ff450)+0x51 [S:\gb\Source\WTF\wtf\Vector.h @ 1060]
> 08 (Inline Function) --------`-------- WTF!WTF::Vector<WTF::Function<void __cdecl+0x10 [S:\gb\Source\WTF\wtf\Vector.h @ 1347]
> 09 (Inline Function) --------`-------- WTF!WTF::Vector<WTF::Function<void __cdecl+0x35 [S:\gb\Source\WTF\wtf\Vector.h @ 780]
> 0a (Inline Function) --------`-------- WTF!WTF::Vector<WTF::Function<void __cdecl+0x35 [S:\gb\Source\WTF\wtf\Vector.h @ 773]
> 0b 00000032`917ff3e0 00007ffc`4b2f500c WTF!WTF::WorkQueue::dispatch(class WTF::Function<void __cdecl(void)> * function = 0x00000032`917ff450)+0x73 [S:\gb\Source\WTF\wtf\win\WorkQueueWin.cpp @ 104]
> 0c 00000032`917ff430 00007ffc`8c68eb1b WebKit2!IPC::Connection::invokeReadEventHandler(void)+0x5c [S:\gb\Source\WebKit\Platform\IPC\win\ConnectionWin.cpp @ 233]
> 0d 00000032`917ff480 00007ffc`8c6905ac ntdll!RtlpTpWaitCallback+0x9b
> 0e 00000032`917ff4f0 00007ffc`8c6941c2 ntdll!TppExecuteWaitCallback+0xa4
> 0f 00000032`917ff540 00007ffc`8c347bd4 ntdll!TppWorkerThread+0x462
> 10 00000032`917ff900 00007ffc`8c6cced1 KERNEL32!BaseThreadInitThunk+0x14
> 11 00000032`917ff930 00000000`00000000 ntdll!RtlUserThreadStart+0x21
Comment 23 Fujii Hironori 2020-04-16 14:12:10 PDT
Created attachment 396697 [details]
crash logs.zip

It's easy to reproduce the crash with mimalloc's secure mode turned on (MI_SECURE=4).
The free list was corrupted. Is it a buffer overrun or a use-after-free?

> #  6  Id: 3c50.20a0 Suspend: 1 Teb: 000000b8`5e65c000 Unfrozen
>  # Child-SP          RetAddr           Call Site
> 00 000000b8`5eefede0 00007ffc`25776326 ucrtbase!abort+0x4e
> 01 (Inline Function) --------`-------- WTF!mi_error_default+0xb [C:\work\mimalloc\src\options.c @ 348]
> 02 000000b8`5eefee10 00007ffc`25778f9f WTF!_mi_error_message(int err = 0n14, char * fmt = 0x00007ffc`25784e78 "corrupted free list entry of size %zub at %p: value 0x%zx.")+0x186 [C:\work\mimalloc\src\options.c @ 369]
> 03 (Inline Function) --------`-------- WTF!mi_block_next+0xaf [C:\work\mimalloc\include\mimalloc-internal.h @ 613]
> 04 (Inline Function) --------`-------- WTF!_mi_page_thread_free_collect+0xef [C:\work\mimalloc\src\page.c @ 173]
> 05 000000b8`5eeff080 00007ffc`2577570a WTF!_mi_page_free_collect(struct mi_page_s * page = 0x00000646`d0000488, bool force = false)+0x11f [C:\work\mimalloc\src\page.c @ 196]
> 06 000000b8`5eeff0f0 00007ffc`25778bfd WTF!_mi_free_delayed_block(struct mi_block_s * block = 0x00000646`d00dd700)+0x4a [C:\work\mimalloc\src\alloc.c @ 466]
> 07 (Inline Function) --------`-------- WTF!_mi_heap_delayed_free+0x5a [C:\work\mimalloc\src\page.c @ 286]
> 08 000000b8`5eeff120 00007ffc`2577526e WTF!_mi_malloc_generic(struct mi_heap_s * heap = 0x000001c6`cabf0000, unsigned int64 size = 0x60)+0xed [C:\work\mimalloc\src\page.c @ 793]
> 09 (Inline Function) --------`-------- WTF!_mi_page_malloc+0xe [C:\work\mimalloc\src\alloc.c @ 28]
> 0a (Inline Function) --------`-------- WTF!mi_heap_malloc_small+0x1b [C:\work\mimalloc\src\alloc.c @ 66]
> 0b 000000b8`5eeff170 00007ffc`25709149 WTF!mi_heap_malloc(struct mi_heap_s * heap = <Value unavailable error>, unsigned int64 size = <Value unavailable error>)+0x2e [C:\work\mimalloc\src\alloc.c @ 84]
> 0c 000000b8`5eeff1a0 00007ffc`18ac531c WTF!WTF::fastMalloc(unsigned int64 n = <Value unavailable error>)+0x9 [S:\gb\Source\WTF\wtf\FastMalloc.cpp @ 202]
> 0d (Inline Function) --------`-------- WebKit2!IPC::Decoder::operator new+0x17 [S:\gb\Source\WebKit\Platform\IPC\Decoder.h @ 45]
> 0e (Inline Function) --------`-------- WebKit2!std::make_unique+0x17 [C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.25.28610\include\memory @ 2064]
> 0f (Inline Function) --------`-------- WebKit2!WTF::makeUnique+0x17 [S:\gb\WebKitBuild\Release\WTF\Headers\wtf\StdLibExtras.h @ 483]
> 10 000000b8`5eeff1d0 00007ffc`25773e1e WebKit2!IPC::Connection::readEventHandler(void)+0xcc [S:\gb\Source\WebKit\Platform\IPC\win\ConnectionWin.cpp @ 143]
> 11 (Inline Function) --------`-------- WTF!WTF::Function<void __cdecl+0x9 [S:\gb\Source\WTF\wtf\Function.h @ 84]
> 12 (Inline Function) --------`-------- WTF!WTF::WorkQueue::performWorkOnRegisteredWorkThread+0x7a [S:\gb\Source\WTF\wtf\win\WorkQueueWin.cpp @ 62]
> 13 000000b8`5eeff260 00007ffc`5c70f655 WTF!WTF::WorkQueue::workThreadCallback(void * context = 0x00000646`cec267b0)+0x9e [S:\gb\Source\WTF\wtf\win\WorkQueueWin.cpp @ 42]
> 14 000000b8`5eeff2c0 00007ffc`5c7145b4 ntdll!RtlpTpWorkCallback+0x165
> 15 000000b8`5eeff3a0 00007ffc`5b597bd4 ntdll!TppWorkerThread+0x8d4
> 16 000000b8`5eeff760 00007ffc`5c74ce51 KERNEL32!BaseThreadInitThunk+0x14
> 17 000000b8`5eeff790 00000000`00000000 ntdll!RtlUserThreadStart+0x21
Comment 24 Fujii Hironori 2020-04-19 14:46:18 PDT
Created attachment 396926 [details]
Patch to replace m_readBuffer's VectorMalloc with mimalloc

Heap corruption seems to happen in m_readBuffer of IPC::Connection.
I replaced m_readBuffer's VectorMalloc with mimalloc (MI_SECURE=4) and got crashes such as the following.

> #  4  Id: 6510.79c8 Suspend: 1 Teb: 000000bb`e4575000 Unfrozen
>  # Child-SP          RetAddr           Call Site
> 00 (Inline Function) --------`-------- WebKit2!mi_ptr_decode+0x9 [C:\work\mimalloc\include\mimalloc-internal.h @ 580]
> 01 (Inline Function) --------`-------- WebKit2!mi_block_nextx+0x9 [C:\work\mimalloc\include\mimalloc-internal.h @ 591]
> 02 (Inline Function) --------`-------- WebKit2!_mi_heap_delayed_free+0x37 [C:\work\mimalloc\src\page.c @ 284]
> 03 000000bb`e4cff6a0 00007ffc`1f615063 WebKit2!_mi_malloc_generic(struct mi_heap_s * heap = 0x0000022d`e4710000, unsigned int64 size = 0x4e4)+0xca [C:\work\mimalloc\src\page.c @ 793]
> 04 (Inline Function) --------`-------- WebKit2!WTF::MiMalloc::malloc+0x8 [S:\gb\WebKitBuild\Release\WTF\Headers\wtf\FastMalloc.h @ 198]
> 05 (Inline Function) --------`-------- WebKit2!WTF::VectorBufferBase<unsigned char,WTF::MiMalloc>::allocateBuffer+0x28 [S:\gb\WebKitBuild\Release\WTF\Headers\wtf\Vector.h @ 292]
> 06 (Inline Function) --------`-------- WebKit2!WTF::Vector<unsigned char,0,WTF::CrashOnOverflow,16,WTF::MiMalloc>::reserveCapacity+0x32 [S:\gb\WebKitBuild\Release\WTF\Headers\wtf\Vector.h @ 1189]
> 07 000000bb`e4cff6f0 00007ffc`1f615497 WebKit2!WTF::Vector<unsigned char,0,WTF::CrashOnOverflow,16,WTF::MiMalloc>::expandCapacity(unsigned int64 newMinCapacity = <Value unavailable error>)+0x63 [S:\gb\WebKitBuild\Release\WTF\Headers\wtf\Vector.h @ 1047]
> 08 (Inline Function) --------`-------- WebKit2!WTF::Vector<unsigned char,0,WTF::CrashOnOverflow,16,WTF::MiMalloc>::grow+0x1d [S:\gb\WebKitBuild\Release\WTF\Headers\wtf\Vector.h @ 1128]
> 09 000000bb`e4cff720 00007ffc`38153dde WebKit2!IPC::Connection::readEventHandler(void)+0x157 [S:\gb\Source\WebKit\Platform\IPC\win\ConnectionWin.cpp @ 124]
> 0a (Inline Function) --------`-------- WTF!WTF::Function<void __cdecl+0x9 [S:\gb\Source\WTF\wtf\Function.h @ 84]
> 0b (Inline Function) --------`-------- WTF!WTF::WorkQueue::performWorkOnRegisteredWorkThread+0x7a [S:\gb\Source\WTF\wtf\win\WorkQueueWin.cpp @ 62]
> 0c 000000bb`e4cff7b0 00007ffc`5c70f655 WTF!WTF::WorkQueue::workThreadCallback(void * context = 0x0000022d`e47fcd60)+0x9e [S:\gb\Source\WTF\wtf\win\WorkQueueWin.cpp @ 42]
> 0d 000000bb`e4cff810 00007ffc`5c7145b4 ntdll!RtlpTpWorkCallback+0x165
> 0e 000000bb`e4cff8f0 00007ffc`5b597bd4 ntdll!TppWorkerThread+0x8d4
> 0f 000000bb`e4cffcb0 00007ffc`5c74ce51 KERNEL32!BaseThreadInitThunk+0x14
> 10 000000bb`e4cffce0 00000000`00000000 ntdll!RtlUserThreadStart+0x21
Comment 25 Fujii Hironori 2020-04-19 22:11:15 PDT
The async ReadFile operation should be canceled by calling the
CancelIo API on the thread that called ReadFile, before
IPC::Connection is destroyed.

However, that is not possible because the Windows port's WorkQueue
doesn't guarantee that its work runs on the same thread.
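
CancelIo only cancels I/O that the calling thread itself issued on the handle, which is why thread affinity matters here. As a minimal, portable sketch of that property (SingleThreadQueue and collectThreadIds are hypothetical names for illustration, not WebKit's WorkQueue or any Windows API): a queue that owns exactly one thread runs every dispatched task on that thread, so a task that cancels a read would run on the same thread as the task that started it.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a work queue backed by exactly one thread.
class SingleThreadQueue {
public:
    SingleThreadQueue() : m_thread([this] { run(); }) {}

    // Runs any tasks still queued, then joins the worker thread.
    ~SingleThreadQueue()
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_stopping = true;
        }
        m_condition.notify_one();
        m_thread.join();
    }

    void dispatch(std::function<void()> task)
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_tasks.push_back(std::move(task));
        }
        m_condition.notify_one();
    }

private:
    void run()
    {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(m_mutex);
                m_condition.wait(lock, [this] { return m_stopping || !m_tasks.empty(); });
                if (m_tasks.empty())
                    return; // Stopping, and nothing left to run.
                task = std::move(m_tasks.front());
                m_tasks.pop_front();
            }
            task(); // Always executed on m_thread.
        }
    }

    std::mutex m_mutex;
    std::condition_variable m_condition;
    std::deque<std::function<void()>> m_tasks;
    bool m_stopping { false };
    std::thread m_thread; // Declared last so the other members exist before run() starts.
};

// Dispatches `count` tasks and records the id of the thread each one ran on.
std::vector<std::thread::id> collectThreadIds(int count)
{
    std::vector<std::thread::id> ids(count);
    {
        SingleThreadQueue queue;
        for (int i = 0; i < count; ++i)
            queue.dispatch([&ids, i] { ids[i] = std::this_thread::get_id(); });
    } // The destructor drains the queue and joins the worker.
    return ids;
}
```

A thread-pool-backed queue gives no such guarantee: the task that issues the read and the task that tries to cancel it may land on different pool threads.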
Comment 26 Fujii Hironori 2020-04-19 22:12:56 PDT
Created attachment 396941 [details]
Patch to leak the vector buffer in Connection::platformInvalidate to avoid heap corruption
Comment 27 Fujii Hironori 2020-04-21 14:35:34 PDT
(In reply to Fujii Hironori from comment #25)
> The async ReadFile operation should be canceled by using CancelIo
> API in the thread called ReadFile before destructing
> IPC::Connection.
> 
> However, It can't be possible because Windows port WorkQueue
> doesn't ensure the same thread.

To fix this, I filed Bug 210785 – [Win] Use generic WorkQueue instead of WorkQueueWin.cpp
Comment 28 Fujii Hironori 2020-04-22 21:03:11 PDT
Bug 210785 landed in r260477.
Surprisingly, after r260477, even though the CancelIo API is not used yet,
I no longer observe the heap corruption crash on the internal Jenkins.
It is also no longer reproducible with CPU affinity restricted to 2 CPUs (Comment 14).

However, I don't want to close this bug as WORKSFORME because I
believe the async ReadFile operation should be canceled by using
the CancelIo API.
Comment 29 Fujii Hironori 2020-04-22 21:04:38 PDT
Created attachment 397312 [details]
WIP Patch
Comment 30 Fujii Hironori 2020-04-23 14:33:47 PDT
Pavel Feldman said:
> @fujihiro we just passed all the tests on WebKit Windows on our bots
> https://github.com/microsoft/playwright/pull/1948/checks?check_run_id=613348018
> So thank you again for making this nasty problem disappear!

The Playwright issue ticket was closed.
https://github.com/microsoft/playwright/issues/680