RESOLVED FIXED 106919
Make AtomicMarkupTokenBase use a bare UChar* for external characters
https://bugs.webkit.org/show_bug.cgi?id=106919
Summary Make AtomicMarkupTokenBase use a bare UChar* for external characters
Tony Gentilcore
Reported 2013-01-15 10:06:07 PST
Make AtomicMarkupTokenBase use a bare UChar* for external characters
Attachments
Patch (5.32 KB, patch)
2013-01-15 10:07 PST, Tony Gentilcore
no flags
Tony Gentilcore
Comment 1 2013-01-15 10:07:26 PST
Eric Seidel (no email)
Comment 2 2013-01-15 10:35:36 PST
Comment on attachment 182800 [details] Patch ok.
WebKit Review Bot
Comment 3 2013-01-15 11:07:51 PST
Comment on attachment 182800 [details] Patch Clearing flags on attachment: 182800 Committed r139760: <http://trac.webkit.org/changeset/139760>
WebKit Review Bot
Comment 4 2013-01-15 11:07:54 PST
All reviewed patches have been landed. Closing bug.
Benjamin Poulain
Comment 5 2013-01-15 23:33:12 PST
Comment on attachment 182800 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=182800&action=review > Source/WebCore/xml/parser/MarkupTokenBase.h:529 > + size_t charactersLength() const > + { > + ASSERT(m_type == Token::Type::Character); > + return m_externalCharactersLength; > } While you are at it: this would be better as unsigned instead of size_t. Character buffers are usually unsigned (WTFString included). > Source/WebCore/xml/parser/MarkupTokenBase.h:588 > + size_t m_externalCharactersLength; ditto.
Tony Gentilcore
Comment 6 2013-01-16 10:37:44 PST
Comment on attachment 182800 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=182800&action=review >> Source/WebCore/xml/parser/MarkupTokenBase.h:529 >> } > > While you are at it: this would be better as unsigned instead of size_t. > Character buffers are usually unsigned (WTFString included). abarth tells me that on a 64bit system, unsigned will still be 32 bits but size_t will be 64. Since Vectors are arbitrary length, using size_t is appropriate here.
Adam Barth
Comment 7 2013-01-16 11:02:15 PST
I wonder if we should crash if we get a character token that's larger than 32 bits. I bet we do later anyway due to overflow checks in StringImpl.
Benjamin Poulain
Comment 8 2013-01-16 13:44:21 PST
> abarth tells me that on a 64bit system, unsigned will still be 32 bits but size_t will be 64. That is correct. > Since Vectors are arbitrary length, using size_t is appropriate here. Well, that's all nice when creating the Token, but the method you call will get you un trouble. E.g.: String characters = String(token->characters(), token->charactersLength()); Your patch changed from using Vector types to Character types, which is why I suggest we move to string conventions. > I wonder if we should crash if we get a character token that's larger than 32 bits. I bet we do later anyway due to overflow checks in StringImpl. Yep, I think you are right. Seeing what is here, I think that's the way to make this correct.
Adam Barth
Comment 9 2013-01-16 14:22:24 PST
Yeah. Would you be willing to file a bug about this issue? The issue pre-dates this patch since the old code used Vector::size as well.
Note You need to log in before you can comment on or make changes to this bug.