Bug 231913

Summary: Do not use strerror()
Product: WebKit Reporter: Michael Catanzaro <mcatanzaro>
Component: WebCore Misc.Assignee: Michael Catanzaro <mcatanzaro>
Status: RESOLVED FIXED    
Severity: Normal CC: achristensen, annulen, benjamin, cdumez, cmarcelo, darin, ddkilzer, ews-watchlist, fpizlo, ggaren, gyuyoung.kim, keith_miller, mark.lam, mcatanzaro, msaboff, ryuan.choi, saam, sam, sergio, tzagallo, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: PC   
OS: Linux   
Attachments:
Description Flags
Patch
none
Patch
ews-feeder: commit-queue-
Patch
ews-feeder: commit-queue-
Patch
ews-feeder: commit-queue-
Patch
none
Patch
ews-feeder: commit-queue-
Patch
ews-feeder: commit-queue-
Patch
none
Patch
none
Patch none

Description Michael Catanzaro 2021-10-18 15:40:33 PDT
strerror() is sadly documented as "MT-Unsafe race:strerror" because it returns strings that point into static memory and it may modify those strings after the function call returns, e.g. to create strings of the form "Unknown error code nnn" which is a real shame. Accordingly, it cannot safely be used anywhere in WebKit. This bug attempts to address all cases outside Source/ThirdParty.

This will not build on Apple ports yet because it adds new files. XCode help would be appreciated.
Comment 1 Michael Catanzaro 2021-10-18 16:23:26 PDT
Created attachment 441655 [details]
Patch
Comment 2 Chris Dumez 2021-10-18 16:33:21 PDT
Comment on attachment 441655 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=441655&action=review

> Source/WTF/wtf/SafeStrerror.cpp:52
> +#if OS(WINDOWS)

Can we use std::strerror_s() everywhere? Also, do we really need to store them in a HashMap, can't we just return a CString?
Comment 3 Michael Catanzaro 2021-10-19 05:21:33 PDT
(In reply to Chris Dumez from comment #2)
> > Source/WTF/wtf/SafeStrerror.cpp:52
> > +#if OS(WINDOWS)
> 
> Can we use std::strerror_s() everywhere?

No. A pedantic answer would be "std::strerror_s() doesn't exist because strerror_s() is not part of C++." A better answer: strerror_s() is part of an optional annex in the C11 standard, but glibc does not implement it, so it's not available on Linux. GLib uses it only on Windows, so I assumed that's the only place it's available in practice. But perhaps it's available on Apple platforms? I'm not sure.

> Also, do we really need to store
> them in a HashMap, can't we just return a CString?

We could return a CString, sure. But then we have to modify all callers to expect a CString instead of a char*. I actually considered that before deciding it's easiest not to modify the callers, but either way is fine by me. Considering it again today, using the CString version and getting rid of the map might be a little nicer. I'll change it if you prefer.

P.S. Please help with XCode build. :P
Comment 4 Chris Dumez 2021-10-19 07:43:44 PDT
Created attachment 441723 [details]
Patch
Comment 5 Chris Dumez 2021-10-19 07:48:06 PDT
Created attachment 441724 [details]
Patch
Comment 6 Chris Dumez 2021-10-19 07:59:38 PDT
Created attachment 441725 [details]
Patch
Comment 7 Chris Dumez 2021-10-19 08:00:42 PDT
You are right, strerror_s() doesn't exist for Cocoa ports. I believe I got your patch building on macOS.
Comment 8 Chris Dumez 2021-10-19 08:08:04 PDT
Created attachment 441727 [details]
Patch
Comment 9 Michael Catanzaro 2021-10-19 08:12:36 PDT
Thanks Chris!

I will update the patch to return CString as you suggested. The more I think about it, the more I agree that's better.
Comment 10 Chris Dumez 2021-10-19 08:16:06 PDT
Comment on attachment 441727 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=441727&action=review

> Source/WTF/wtf/SafeStrerror.cpp:60
> +    static NeverDestroyed<HashMap<int, CString>> errorsMap WTF_GUARDED_BY_LOCK(lock);

The reason I don't like this is that this is an ever-growing HashMap and a waste of memory. In general, error handling functions are not perf-sensitive I think so we should be able to get away with returning a CString.
It is true you'd need to update the call sites but only to add `.data()`, and you are already updating all call sites anyway.

> Source/WTF/wtf/SafeStrerror.cpp:67
> +    char buffer[2048];

How did you come up with this value? From quick googling, it seems 1024 may suffice?

> Source/WTF/wtf/SafeStrerror.h:4
> + * This library is free software; you can redistribute it and/or

Is this intentionally not using the usual license of the WebKit project?
Comment 11 Chris Dumez 2021-10-19 08:32:51 PDT
Comment on attachment 441727 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=441727&action=review

> Source/bmalloc/libpas/src/libpas/pas_page_malloc.c:183
> +            static_assert(std::is_same<decltype(ret), int>::value, "This code expects POSIX strerror_r, not GNU strerror_r");

Oh, I didn't catch that because libpas doesn't build on my system. However, libpas is written in C, not C++ so the use of std::is_same here is not OK.
Comment 12 Michael Catanzaro 2021-10-19 08:46:38 PDT
(In reply to Chris Dumez from comment #10)
> The reason I don't like this is that this is an ever-growing HashMap and a
> waste of memory. In general, error handling functions are not perf-sensitive
> I think so we should be able to get away with returning a CString.
> It is true you'd need to update the call sites but only to add `.data()`,
> and you are already updating all call sites anyway.

Ack.

> > Source/WTF/wtf/SafeStrerror.cpp:67
> > +    char buffer[2048];
> 
> How did you come up with this value? From quick googling, it seems 1024 may
> suffice?

1024 is guaranteed to be enough for glibc. In practice, it's almost surely enough for other libcs. I copied 2048 from g_strerror(), but I will drop it down to 1024. That's an awful lot of characters for an error message, anyway.

> > Source/WTF/wtf/SafeStrerror.h:4
> > + * This library is free software; you can redistribute it and/or
> 
> Is this intentionally not using the usual license of the WebKit project?

Yeah, since it's originally copied from LGPLv2.1+ code. WebKit requires that code use either BSD or LGPLv2.1, and plenty of other files in WTF already use this license, so shouldn't be controversial.

That said, after I remove the HashMap as you suggested, the remaining implementation will be almost entirely different different from g_strerror(). The remaining code from GLib is just a couple lines, which I changed anyway to handle the differences between POSIX vs. GNU strerror_r without requiring a cmake test, so really the only thing remaining from GLib is the strategy of using strerror_s() when building for Windows. So I'm going to say the remaining copying is completely de minimis, and we can use BSD license for this file.

> Oh, I didn't catch that because libpas doesn't build on my system. However, libpas is written in C, not C++ so the use of std::is_same here is not OK.

Oops, not sure how I missed that. Drat.
Comment 13 Michael Catanzaro 2021-10-19 12:14:47 PDT
Comment on attachment 441727 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=441727&action=review

> Source/WTF/wtf/SafeStrerror.cpp:36
> +static char* strErrorAdaptor(int errnum, char* buffer, size_t bufferLength)

Why was the separate function needed?

> Source/WTF/wtf/SafeStrerror.cpp:47
> +    // POSIX strerror_r
> +    if (ret)
> +        *buffer = '\0';

Not sure what you were thinking here, but writing NUL to the first position in the buffer of course cuts off the entire message.

That said, I see the POSIX strerror_r is not guaranteed to be NUL-terminated like the GNU version is. Oops. I guess you were trying to fix that to prevent it from crashing, but will need to be done differently.
Comment 14 Chris Dumez 2021-10-19 12:19:08 PDT
(In reply to Michael Catanzaro from comment #13)
> Comment on attachment 441727 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=441727&action=review
> 
> > Source/WTF/wtf/SafeStrerror.cpp:36
> > +static char* strErrorAdaptor(int errnum, char* buffer, size_t bufferLength)
> 
> Why was the separate function needed?

The issue is that even with if constexpr, both branches of the if / else must build successfully. I found it easier to fix the build by moving the logic to a separate function but maybe there is a way to fix your original code without actually needing such function. Your original code didn't build on macOS (because one of the 2 branches was not valid).

> 
> > Source/WTF/wtf/SafeStrerror.cpp:47
> > +    // POSIX strerror_r
> > +    if (ret)
> > +        *buffer = '\0';
> 
> Not sure what you were thinking here, but writing NUL to the first position
> in the buffer of course cuts off the entire message.

But only in case of error. Shouldn't we return an empty string in case of error?

> 
> That said, I see the POSIX strerror_r is not guaranteed to be NUL-terminated
> like the GNU version is. Oops. I guess you were trying to fix that to
> prevent it from crashing, but will need to be done differently.
Comment 15 Michael Catanzaro 2021-10-19 12:22:29 PDT
(In reply to Chris Dumez from comment #14)
> But only in case of error. Shouldn't we return an empty string in case of
> error?

Oh yeah, makes sense. That's better than the ASSERT() I had before.
Comment 16 Michael Catanzaro 2021-10-19 12:39:04 PDT
(In reply to Michael Catanzaro from comment #15) 
> Oh yeah, makes sense. That's better than the ASSERT() I had before.

Let's make it say "unknown error code"
Comment 17 Michael Catanzaro 2021-10-19 14:10:13 PDT
Created attachment 441795 [details]
Patch
Comment 18 Michael Catanzaro 2021-10-19 14:10:35 PDT
Switched it to CString. Let's see if it builds....
Comment 19 Chris Dumez 2021-10-19 14:16:11 PDT
Comment on attachment 441795 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=441795&action=review

> Source/WTF/wtf/SafeStrerror.cpp:45
> +    char buffer[1024];

We should be using a CStringBuffer to avoid a string copy.
Comment 20 Michael Catanzaro 2021-10-19 14:44:17 PDT
That would require giving CStringBuffer a public constructor. I don't think it's worth it. If safeStrerror ever shows up in profiling, that's would be weird and would surely indicate a problem with using it too much.
Comment 21 Chris Dumez 2021-10-19 14:52:04 PDT
(In reply to Michael Catanzaro from comment #20)
> That would require giving CStringBuffer a public constructor. I don't think
> it's worth it. If safeStrerror ever shows up in profiling, that's would be
> weird and would surely indicate a problem with using it too much.

Huh? static Ref<CStringBuffer> createUninitialized(size_t length);
Comment 22 Michael Catanzaro 2021-10-19 15:07:11 PDT
(In reply to Chris Dumez from comment #21)
> Huh? static Ref<CStringBuffer> createUninitialized(size_t length);

Well, OK, but it's private and I'd have to make it public. Still don't think it's worth it, but can do it if you really want to.
Comment 23 Michael Catanzaro 2021-10-19 15:09:01 PDT
Created attachment 441805 [details]
Patch
Comment 24 Chris Dumez 2021-10-19 15:09:55 PDT
(In reply to Chris Dumez from comment #21)
> (In reply to Michael Catanzaro from comment #20)
> > That would require giving CStringBuffer a public constructor. I don't think
> > it's worth it. If safeStrerror ever shows up in profiling, that's would be
> > weird and would surely indicate a problem with using it too much.
> 
> Huh? static Ref<CStringBuffer> createUninitialized(size_t length);

Hmm indeed, I am not quite sure why though since there is a public CString constructor that takes in a CStringBuffer*. Seems like it would be as easy as making this static function public?
Comment 25 Michael Catanzaro 2021-10-19 15:15:55 PDT
(In reply to Chris Dumez from comment #24)
> Hmm indeed, I am not quite sure why though since there is a public CString
> constructor that takes in a CStringBuffer*. Seems like it would be as easy
> as making this static function public?

I think it's because you can get the CStringBuffer from an existing CString and use it to create a different CString, since the CStringBuffer is refcounted.
Comment 26 Chris Dumez 2021-10-19 15:44:19 PDT
(In reply to Chris Dumez from comment #24)
> (In reply to Chris Dumez from comment #21)
> > (In reply to Michael Catanzaro from comment #20)
> > > That would require giving CStringBuffer a public constructor. I don't think
> > > it's worth it. If safeStrerror ever shows up in profiling, that's would be
> > > weird and would surely indicate a problem with using it too much.
> > 
> > Huh? static Ref<CStringBuffer> createUninitialized(size_t length);
> 
> Hmm indeed, I am not quite sure why though since there is a public CString
> constructor that takes in a CStringBuffer*. Seems like it would be as easy
> as making this static function public?

Oh, I think we can use CString::newUninitialized()
Comment 27 Michael Catanzaro 2021-10-19 17:17:22 PDT
(In reply to Chris Dumez from comment #26)
> Oh, I think we can use CString::newUninitialized()

For the Windows/strerror_s case, yes. Also yes for the Apple/POSIX strerror_r case. This is an improvement.

For the Linux/GNU strerror_r case, it gets awkward because we have to use the return value of strerror_r, not the buffer we pass to it, so we have to copy the return value back into the CString's buffer, being careful that return value may or may not overlap the buffer, so using normal string handling functions like strcpy/memcpy would be undefined behavior. We have to fall back to memmove, which is surely slower than a copy would have been. But I think that is OK, because none of this is performance-sensitive, and it seems good to optimize for the standard strerror_s/strerror_r rather than optimizing for glibc's seriously messed up strerror_r.
Comment 28 Chris Dumez 2021-10-19 17:45:13 PDT
(In reply to Michael Catanzaro from comment #27)
> (In reply to Chris Dumez from comment #26)
> > Oh, I think we can use CString::newUninitialized()
> 
> For the Windows/strerror_s case, yes. Also yes for the Apple/POSIX
> strerror_r case. This is an improvement.
> For the Linux/GNU strerror_r case, it gets awkward because we have to use
> the return value of strerror_r, not the buffer we pass to it, so we have to
> copy the return value back into the CString's buffer, being careful that
> return value may or may not overlap the buffer, so using normal string
> handling functions like strcpy/memcpy would be undefined behavior. We have
> to fall back to memmove, which is surely slower than a copy would have been.
> But I think that is OK, because none of this is performance-sensitive, and
> it seems good to optimize for the standard strerror_s/strerror_r rather than
> optimizing for glibc's seriously messed up strerror_r.

Yes, I think it is fine to keep the string copy where needed but still good to optimize where we can.
Comment 29 Michael Catanzaro 2021-10-19 18:01:05 PDT
Created attachment 441828 [details]
Patch
Comment 30 Michael Catanzaro 2021-10-19 18:06:23 PDT
Created attachment 441829 [details]
Patch
Comment 31 Chris Dumez 2021-10-19 19:13:51 PDT
Comment on attachment 441829 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=441829&action=review

Looks pretty good overall. Just have a few comments still.

> Source/WTF/ChangeLog:8
> +        Add a new safeStrerror function that we can use without safety worries.

without worrying about thread safety.

I think it is good to be clear what what we're worried about. Safety is very vague.

> Source/WTF/wtf/SafeStrerror.cpp:38
> +    const size_t bufferLength = 1024;

constexpr

> Source/WTF/wtf/SafeStrerror.cpp:39
> +    char* cstringBuffer = nullptr;

No need for `= nullptr`.

> Source/WTF/wtf/SafeStrerror.cpp:50
> +        int resultLength = strlen(reinterpret_cast<char*>(ret));

use size_t for resultLength or auto but preferably not int.

> Source/WTF/wtf/SafeStrerror.cpp:51
> +        memmove(cstringBuffer, reinterpret_cast<char*>(ret), resultLength);

Can we avoid doing this work in the (presumably common I assume) case where `reinterpret_cast<char*>(ret) == cstringBuffer`?

> Source/WTF/wtf/SafeStrerror.cpp:52
> +        cstringBuffer[resultLength] = '\0';

Can we memmove `resultLength + 1` bytes and avoid this? The doc says there is always a NUL at the end.

Or actually, maybe we can simply use strncpy() if we only call it when `reinterpret_cast<char*>(ret) != cstringBuffer`.

> Source/WTF/wtf/SafeStrerror.cpp:56
> +            snprintf(cstringBuffer, bufferLength, "%s %d", "Unknown error", errnum);

Should probably be this:
snprintf(cstringBuffer, bufferLength, "Unknown error %d", errnum);

> Source/WTF/wtf/SafeStrerror.h:28
> +#include <wtf/text/CString.h>

<wtf/Forward.h> should suffice.
Comment 32 Michael Catanzaro 2021-10-20 06:15:23 PDT
(In reply to Chris Dumez from comment #31)
> without worrying about thread safety.
> 
> I think it is good to be clear what what we're worried about. Safety is very
> vague.

OK.

> > Source/WTF/wtf/SafeStrerror.cpp:38
> > +    const size_t bufferLength = 1024;
> 
> constexpr

Er, OK, but why? I think it makes no difference?
 
> > Source/WTF/wtf/SafeStrerror.cpp:39
> > +    char* cstringBuffer = nullptr;
> 
> No need for `= nullptr`.

Yeah I know, but this way it will break better if CString::newUninitialized() changes in the future and somehow builds. Maybe pretty unlikely, but uninitialized memory plus pass-by-reference out parameters is scary. So I'd lean towards keeping this.

> > Source/WTF/wtf/SafeStrerror.cpp:50
> > +        int resultLength = strlen(reinterpret_cast<char*>(ret));
> 
> use size_t for resultLength or auto but preferably not int.

Oops, yes.

> > Source/WTF/wtf/SafeStrerror.cpp:51
> > +        memmove(cstringBuffer, reinterpret_cast<char*>(ret), resultLength);
> 
> Can we avoid doing this work in the (presumably common I assume) case where
> `reinterpret_cast<char*>(ret) == cstringBuffer`?

Yes, OK. Without looking at the glibc implementation, I'm not sure how common that really is. Might happen when locale is non-English or when an unknown error code is passed. I think for valid errors codes in English locales, it probably won't use the provided buffer.

> > Source/WTF/wtf/SafeStrerror.cpp:52
> > +        cstringBuffer[resultLength] = '\0';
> 
> Can we memmove `resultLength + 1` bytes and avoid this? The doc says there
> is always a NUL at the end.
> 
> Or actually, maybe we can simply use strncpy() if we only call it when
> `reinterpret_cast<char*>(ret) != cstringBuffer`.

strncpy() is safe in this case if glibc always returns a pointer to the start of the buffer, which it really ought to do, since it would be very weird for it to return a pointer to the middle of the buffer for no reason. The docs don't really say, but I suppose it should be safe to make this assumption.

> > Source/WTF/wtf/SafeStrerror.cpp:56
> > +            snprintf(cstringBuffer, bufferLength, "%s %d", "Unknown error", errnum);
> 
> Should probably be this:
> snprintf(cstringBuffer, bufferLength, "Unknown error %d", errnum);

Ah yes.

> > Source/WTF/wtf/SafeStrerror.h:28
> > +#include <wtf/text/CString.h>
> 
> <wtf/Forward.h> should suffice.

I don't think so, because we return a CString by value, not a pointer, so I would expect the full definition to be required. Maybe I'm wrong. Will try and see if it builds.
Comment 33 Michael Catanzaro 2021-10-20 06:47:49 PDT
Created attachment 441876 [details]
Patch
Comment 34 Michael Catanzaro 2021-10-20 06:52:42 PDT
(In reply to Michael Catanzaro from comment #32)
> I don't think so, because we return a CString by value, not a pointer, so I
> would expect the full definition to be required. Maybe I'm wrong. Will try
> and see if it builds.

You were right!
Comment 35 Chris Dumez 2021-10-20 07:24:37 PDT
Comment on attachment 441876 [details]
Patch

r=me.
Comment 36 EWS 2021-10-20 08:04:34 PDT
Committed r284533 (243275@main): <https://commits.webkit.org/243275@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 441876 [details].
Comment 37 Radar WebKit Bug Importer 2021-10-20 08:05:21 PDT
<rdar://problem/84462085>
Comment 38 Filip Pizlo 2021-10-27 09:01:53 PDT
(In reply to Michael Catanzaro from comment #0)
> strerror() is sadly documented as "MT-Unsafe race:strerror" because it
> returns strings that point into static memory and it may modify those
> strings after the function call returns, e.g. to create strings of the form
> "Unknown error code nnn" which is a real shame. Accordingly, it cannot
> safely be used anywhere in WebKit. This bug attempts to address all cases
> outside Source/ThirdParty.
> 
> This will not build on Apple ports yet because it adds new files. XCode help
> would be appreciated.

Wow, this makes me so sad!  But, understandable.  Thanks for fixing this.