Bug 236997 - [WTF] LikelyDenseUnsignedIntegerSet::add can cause a very slow reindexing of the entire bit vector with every call in the worst case
Summary: [WTF] LikelyDenseUnsignedIntegerSet::add can cause a very slow reindexing of ...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Web Template Framework (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Robin Morisset
URL:
Keywords: InRadar
Depends on:
Blocks: 236269
  Show dependency treegraph
 
Reported: 2022-02-21 15:11 PST by Robin Morisset
Modified: 2022-03-08 18:58 PST (History)
11 users (show)

See Also:


Attachments
Patch (12.14 KB, patch)
2022-02-21 15:17 PST, Robin Morisset
saam: review+
ews-feeder: commit-queue-
Details | Formatted Diff | Diff
Patch (12.98 KB, patch)
2022-03-07 14:00 PST, Robin Morisset
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Robin Morisset 2022-02-21 15:11:25 PST
This problem was found while investigating https://bugs.webkit.org/show_bug.cgi?id=236269.

LikelyDenseUnsignedIntegerSet has two forms: either as a HashSet (if sparse) or as a BitVector representing numbers above some value m_min (if sufficiently dense).
This is a massive memory win in most situations (>4x in practice for register allocation in JS2, >20x on some pathological webpages).
But it means that when adding a value below m_min to a set in BitVector shape, we have to rebuild the whole set, which takes a time proportional to the time of the set.
So if building a set by repeatedly adding decreasing values (like in https://bugs.webkit.org/show_bug.cgi?id=236269 where we add 10000, then 9999, then 9998, etc..), we have some awful performance.

In this patch I improve this situation in two ways:
- First I always round down m_min to the next multiple of 64. This means that when adding contiguous values like above we only re-index once every 64 adds.
- It then allows me to do the reindexing by simple memcpy instead of costly iteration of all the set bits, since they are now always at the same offset within the words of the BitVector.

On an M1 MBP (2019), on testair:: testZDefOfSpillSlotWithOffsetNeedingToBeMaterializedInARegister, with n=5000, in release mode, measuring just the part spent building the interference graph:
Before this patch: 107 s
After this patch: 77 ms (note the different unit, it is not a typo!)

Unfortunately, it does not seem to significantly improve the time spent in LikelyDenseUnsignedIntgerSet::add in JetStream2, probably because the pattern of always adding a value just before the minimum is quite pathological/rare. I still think it is worth landing, as we don't know what code out there may hit this performance problem.
Comment 1 Robin Morisset 2022-02-21 15:17:22 PST
Created attachment 452784 [details]
Patch
Comment 2 Saam Barati 2022-02-21 18:23:39 PST
Comment on attachment 452784 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=452784&action=review

r=me, this looks good. I wonder if there's something even more we can do here, that's less divide run time by 64, and perhaps more along the lines of "double memory each time you grow a vector" amortization. I'm not sure we need to go too crazy right now, but just food for thought. What if we were able to instead divide our minimum by 2 (or 4, or similar) each time, if we have reason to believe the set will stay dense.

> Source/WTF/ChangeLog:19
> +        On an M1 MBP (2019), on testair:: testZDefOfSpillSlotWithOffsetNeedingToBeMaterializedInARegister, with n=5000, in release mode, measuring just the time spent building the interference graph:

I think you meant 2020 here.

> Source/WTF/wtf/BitVector.cpp:110
> +        ASSERT(shiftInWords + 1 <= newNumWords);

RELEASE_ASSERT?

> Source/WTF/wtf/BitVector.cpp:117
> +            ASSERT(shiftInWords + oldNumWords <= newNumWords);

RELEASE_ASSERT?

> Source/WTF/wtf/LikelyDenseUnsignedIntegerSet.h:164
> +            m_inline.bitVector.shiftRightByMultipleOf64(m_min - newMin);

nit: ASSERT that newMin < m_min?
Comment 3 Jonathan Bedard 2022-02-22 08:21:00 PST
3 failures in https://ews-build.webkit.org/#/builders/70/builds/972 are known, stopped the build before retries.
Comment 4 Radar WebKit Bug Importer 2022-02-28 15:12:32 PST
<rdar://problem/89584035>
Comment 5 Robin Morisset 2022-03-07 14:00:01 PST
Created attachment 454029 [details]
Patch
Comment 6 Robin Morisset 2022-03-08 16:49:50 PST
Comment on attachment 454029 [details]
Patch

Trying to land this, as all bots are green except for iOS-wk2 which is visibly broken for all patches.
Comment 7 EWS 2022-03-08 18:58:00 PST
Committed r291027 (248201@main): <https://commits.webkit.org/248201@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 454029 [details].