| Summary: | [WTF] Unsafe String concatenation in optimized clang builds | ||
|---|---|---|---|
| Product: | WebKit | Reporter: | Philippe Normand <philn> |
| Component: | WPE WebKit | Assignee: | Philippe Normand <philn> |
| Status: | RESOLVED FIXED | ||
| Severity: | Normal | CC: | bugs-noreply, cgarcia, darin, mcatanzaro, webkit-bug-importer, zan |
| Priority: | P2 | ||
| Version: | WebKit Nightly Build | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
|
Description
Philippe Normand
2022-09-25 07:38:05 PDT
Pull request: https://github.com/WebKit/WebKit/pull/4683 What line of code is crashing? I can’t find the code that is missing a null check. Null strings are supposed to be treated the same as empty strings by this code. With these flags I can reproduce the crash, building with clang 14:
CFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection'
export CFLAGS
CXXFLAGS='-O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection'
export CXXFLAGS
I added log statements:
diff --git a/Source/WTF/wtf/text/StringView.h b/Source/WTF/wtf/text/StringView.h
index 5814931863e9..d1fd709cd910 100644
--- a/Source/WTF/wtf/text/StringView.h
+++ b/Source/WTF/wtf/text/StringView.h
@@ -393,9 +393,11 @@ inline StringView::StringView(const StringImpl& string)
inline StringView::StringView(const StringImpl* string)
{
+ WTFLogAlways("StringImpl 0 %p", string);
if (!string)
return;
+ WTFLogAlways("StringImpl 1 %p", string);
setUnderlyingString(string);
if (string->is8Bit())
initialize(string->characters8(), string->length());
Runtime:
...
StringImpl 0 0x7f7772427000
StringImpl 1 0x7f7772427000
StringImpl 0 0x7f77c41098e0
StringImpl 1 0x7f77c41098e0
StringImpl 0 (nil)
StringImpl 1 (nil)
StringImpl 0 0x7fc9661086a0
StringImpl 1 0x7fc9661086a0
StringImpl 0 0x7fc966118120
StringImpl 1 0x7fc966118120
How are we reaching the "1" call with a nil string?! Could this be a compiler bug?
That's very confusing. I don't know how this could happen. Compilers do like to optimize away nullptr checks like these if they detect undefined behavior, but I'm not sure how that could be happening here. For example, checks for if (this != nullptr) will surely be optimized away. But that's a totally different case than what's happening here. Although it's *probably* unrelated, it reminds me a little of bug #245968, which I reported yesterday, where the point where it's crashing should be unreachable due to a similar check, in that case if (!cache_node). Odd. That one is built with GCC, though. > Could this be a compiler bug? It kinda looks like it. That said, in every similar situation where I previously suspected a GCC or Clang compiler bug, I was always wrong.... (In reply to Philippe Normand from comment #0) > #1 WTF::StringTypeAdapter<WTF::StringImpl*, void>::writeTo<unsigned char>(unsigned char*) const (destination=0x7fa2940e84b0 "", this=<optimized out>) at WTF/Headers/wtf/text/StringConcatenate.h:188 This should test for a non-null StringImpl pointer. Even if the test in the StringView constructor would do the early return, the later call to StringView::getCharactersWithUpconvert() on this object would crash cause the StringView wouldn't have any character data initialized, due to the StringImpl being null to begin with. Clang possibly understands this to the point of eliminating that null check, but it only brings the crash forward anyway. (In reply to Zan Dobersek from comment #5) > (In reply to Philippe Normand from comment #0) > > #1 WTF::StringTypeAdapter<WTF::StringImpl*, void>::writeTo<unsigned char>(unsigned char*) const (destination=0x7fa2940e84b0 "", this=<optimized out>) at WTF/Headers/wtf/text/StringConcatenate.h:188 > > This should test for a non-null StringImpl pointer. Even if the test in the > StringView constructor would do the early return, the later call to > StringView::getCharactersWithUpconvert() on this object would crash cause > the StringView wouldn't have any character data initialized, due to the > StringImpl being null to begin with. Clang possibly understands this to the > point of eliminating that null check, but it only brings the crash forward > anyway. Ah! So the current version of my PR would be correct? I can expand the commit message as well, thanks for the explanation. (In reply to Zan Dobersek from comment #5) > (In reply to Philippe Normand from comment #0) > > #1 WTF::StringTypeAdapter<WTF::StringImpl*, void>::writeTo<unsigned char>(unsigned char*) const (destination=0x7fa2940e84b0 "", this=<optimized out>) at WTF/Headers/wtf/text/StringConcatenate.h:188 > > This should test for a non-null StringImpl pointer. Even if the test in the > StringView constructor would do the early return, the later call to > StringView::getCharactersWithUpconvert() on this object would crash cause > the StringView wouldn't have any character data initialized, due to the > StringImpl being null to begin with. Clang possibly understands this to the > point of eliminating that null check, but it only brings the crash forward > anyway. With a null StringImpl, StringView::getCharactersWithUpconvert() doesn't crash, but it calls into StringImpl::copyCharacters() and causes undefined behavior when calling memcpy() which expects the source destination to be non-null. This memcpy() call still survives, so there's no crashes on other compilers. With Clang 14, the compiler takes memcpy()'s non-null requirements for granted and makes optimizations based on that, to the point where it assumes the StringImpl will never be null because data based on it will be passed to memcpy(), removing the early return in the StringView constructor. This is avoidable by doing an early return in StringImpl::copyCharacters() in case that the numCharacters value is zero (which happens when StringImpl is null and the passed-in length is also zero). (In reply to Zan Dobersek from comment #5) > This should test for a non-null StringImpl pointer. Even if the test in the > StringView constructor would do the early return, the later call to > StringView::getCharactersWithUpconvert() on this object would crash cause > the StringView wouldn't have any character data initialized, due to the > StringImpl being null to begin with. Clang possibly understands this to the > point of eliminating that null check, but it only brings the crash forward > anyway. Why would StringView::getCharactersWithUpconvert crash? StringView supports calls to getCharactersWithUpconvert when the StringView points to null. This concept that "the StringView wouldn't have any character data initialized" doesn't make logical sense. This is a StringView with a character pointer of null and a length of 0. There is no character data to be initialized. (In reply to Zan Dobersek from comment #7) > With a null StringImpl, StringView::getCharactersWithUpconvert() doesn't > crash, but it calls into StringImpl::copyCharacters() and causes undefined > behavior when calling memcpy() which expects the source destination to be > non-null. This memcpy() call still survives, so there's no crashes on other > compilers. Agreed. We should add the null check either in StringView::getCharactersWithUpconvert before calling StringImpl::copyCharacters or in StringImpl::copyCharacters. The compiler is really smart and discovers code later that is using the pointer in way that is invalid if it’s null and then reasons it must not be null and so optimizes out an unnecessary null check. We need to decide that the contract of StringImpl::copyCharacters allows passing a null with a length of zero. If we want it to allow a null, then an if statement needs to go there before calling memcpy. If we want it to require a non-null pointer we should add an assertion and add checks at any calls sites that would otherwise pass null pointers. (In reply to Zan Dobersek from comment #7) > This is avoidable by doing an early return in StringImpl::copyCharacters() > in case that the numCharacters value is zero (which happens when StringImpl > is null and the passed-in length is also zero). I do not think we should do this version. I’d rather use the null pointer check which can sometimes be optimized out by the compiler rather than the length check, which almost certainly can’t be optimized out. One reason I like the idea of putting the null check in StringView::getCharactersWithUpconvert is that it’s just allocated memory so one more null check is *highly* unlikely to have performance effects. Not sure that’s true of all other call sites of StringImpl::copyCharacters. On the other hand, maybe this issue exists at some other call sites? (In reply to Darin Adler from comment #12) > One reason I like the idea of putting the null check in > StringView::getCharactersWithUpconvert is that it’s just allocated memory so > one more null check is *highly* unlikely to have performance effects. > Alright, I'll update the PR then. > Not sure that’s true of all other call sites of StringImpl::copyCharacters. > > On the other hand, maybe this issue exists at some other call sites? I have no evidence of this yet. I wonder though, would it make sense to add an ASSERT(source) in that method, at least immediately before the memcpy() call? (In reply to Philippe Normand from comment #13) > I wonder though, would it make sense to add an ASSERT(source) in that > method, at least immediately before the memcpy() call? Yes, we should add that. Committed 255330@main (c736ca45a340): <https://commits.webkit.org/255330@main> Reviewed commits have been landed. Closing PR #4683 and removing active labels. (In reply to Darin Adler from comment #14) > (In reply to Philippe Normand from comment #13) > > I wonder though, would it make sense to add an ASSERT(source) in that > > method, at least immediately before the memcpy() call? > > Yes, we should add that. https://bugs.webkit.org/show_bug.cgi?id=246267 |