We should have the tokenizer use a trie, or a similar structure, for tag names.
Note that HTML allows any tag name to be used, regardless of whether it matches a built-in element name, and case sensitivity depends on the type of document and element being parsed. Also note that a single tag name like script can refer to two distinct qualified names in different namespaces (SVG and HTML in this case).
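
To make those constraints concrete, here is a rough Python sketch of a trie lookup that respects them. It is only an illustration under assumptions, not the project's actual code: TagNameTrie, add, lookup, and the case_sensitive flag are all hypothetical names, and the choice to index by lowercased name is mine.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

HTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"


@dataclass
class TrieNode:
    children: Dict[str, "TrieNode"] = field(default_factory=dict)
    # namespace URI -> canonical local name; filled only where a known name ends
    names: Dict[str, str] = field(default_factory=dict)


class TagNameTrie:
    """Hypothetical trie of built-in tag names, keyed by lowercased characters."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def add(self, namespace: str, local_name: str) -> None:
        node = self.root
        for ch in local_name.lower():
            node = node.children.setdefault(ch, TrieNode())
        # One terminal node can hold the same name for several namespaces.
        node.names[namespace] = local_name

    def lookup(self, raw: str, namespace: str,
               case_sensitive: bool) -> Tuple[str, str, bool]:
        """Return (namespace, local name, is_built_in)."""
        node = self.root
        for ch in raw.lower():
            node = node.children.get(ch)
            if node is None:
                break
        known = node.names.get(namespace) if node is not None else None
        if known is not None and (not case_sensitive or known == raw):
            return (namespace, known, True)
        # Unknown names are still valid tag names, so fall back to the raw text.
        return (namespace, raw if case_sensitive else raw.lower(), False)


trie = TagNameTrie()
trie.add(HTML_NS, "script")
trie.add(SVG_NS, "script")
trie.add(SVG_NS, "foreignObject")
```

With this shape, lookup("SCRIPT", HTML_NS, False) and lookup("script", SVG_NS, True) both hit the same terminal node but yield different qualified names, while something like lookup("my-widget", HTML_NS, False) simply falls through as a non-built-in name.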
@Chris - Is this still applicable?
Right, this was already implemented a while back.