RESOLVED FIXED 231607
REGRESSION (iOS 15): Tab crashes when trying to render Projector stories
https://bugs.webkit.org/show_bug.cgi?id=231607
Summary REGRESSION (iOS 15): Tab crashes when trying to render Projector stories
Sidhu Alluri
Reported 2021-10-12 12:37:08 PDT
The following link crashes the iOS 15 Safari tab on iPhone X, 12 Mini, and 13. Works fine in all devices with iOS 14 or earlier. https://staging.projector.com/story/be1ca8c7-a46a-41fc-9ad6-de8b8f178a8b?scene=6f3e4207 After reloading once, the tab displays "A problem repeatedly occurred on https://staging.projector.com/story/be1ca8c7-a46a-41fc-9ad6-de8b8f178a8b?scene=6f3e4207" No console errors when trying to debug.
Attachments
Patch (4.59 KB, patch)
2021-10-28 17:17 PDT, Kyle Piddington
no flags
Patch (8.39 KB, patch)
2021-10-29 20:03 PDT, Kyle Piddington
no flags
Patch for landing (8.33 KB, patch)
2021-11-03 12:31 PDT, Kyle Piddington
no flags
Kenneth Russell
Comment 1 2021-10-12 14:16:43 PDT
I only have the hardware to test this on macOS but the context creation attributes appear to be: alpha: false antialias: false powerPreference: "high-performance" premultipliedAlpha: true preserveDrawingBuffer: false stencil: false Not sure why this combination might cause the tab to crash. Kimmo, would it be possible for you to confirm this? It doesn't look like the xrCompatible context creation attribute is the issue here.
Sidhu Alluri
Comment 2 2021-10-12 18:41:41 PDT
Managed to get it to stop crashing after reducing the gl buffer size to 15MB (down from 96MB, we use a single large buffer). `createBuffer` call succeeds but leads to eventual tab crash after enough draw calls. I'll try to narrow it further, but wanted to give an update.
Sidhu Alluri
Comment 3 2021-10-13 08:21:51 PDT
(In reply to Sidhu Alluri from comment #0) > The following link crashes the iOS 15 Safari tab on iPhone X, 12 Mini, and > 13. Works fine in all devices with iOS 14 or earlier. > https://staging.projector.com/story/be1ca8c7-a46a-41fc-9ad6- > de8b8f178a8b?scene=6f3e4207 > > After reloading once, the tab displays "A problem repeatedly occurred on > https://staging.projector.com/story/be1ca8c7-a46a-41fc-9ad6- > de8b8f178a8b?scene=6f3e4207" > > No console errors when trying to debug. I'm deploying a lower sized gl buffer to work around the issue but I added a query param to manually set the buffer size. The issue can still be repro'd via this link https://staging.projector.com/story/be1ca8c7-a46a-41fc-9ad6-de8b8f178a8b?scene=6f3e4207&glBufferSize=100663296
Radar WebKit Bug Importer
Comment 4 2021-10-13 17:18:52 PDT
Kimmo Kinnunen
Comment 5 2021-10-15 07:26:33 PDT
This appears to be due to WebContent process running over memory limit. It's unclear if it's Metal or GPU process difference to iOS 14.
Kimmo Kinnunen
Comment 6 2021-10-22 07:05:59 PDT
This is mtl::Buffer instances being leaked due to buffer pool creating them unbounded
Kyle Piddington
Comment 7 2021-10-22 12:47:09 PDT
Looking into this bug: It seems there’s some trouble at work with the per-frame data. I see the initial 96 MB allocation, but not any repetition of that initial allocation. Each frame, however, I'm seeing about 38 more MB of allocated data. We seem to hold on to the first few frames worth of data, and don't get around to clearing them, however, which I think is leading to this bug.
Kyle Piddington
Comment 8 2021-10-22 12:54:15 PDT
From analysis: We're creating a lot more buffer pools than we are buffers. It seems the issue isn't one buffer pool allocating too much, but too many buffer pools being created in the first place.
Le Hoang Quyen
Comment 9 2021-10-22 18:12:54 PDT
Sorry, the vertex conversion's buffer pool mechanism in the Metal back-end was originally written by me. So I did attempt to trace the Metal commands while running this website in order to know what's going on. Here what I observed: - The website created a single large buffer (roughly 100MB) then seemingly used many small portions of it for different draw calls: - portion from 0-100 for draw call 1 (only needs 10 vertices, this is an approximate number). - portion from 100-200 for draw call 2 (only needs 10 vertices). - ... - portion from 90000000-100000000 for draw calls n (only needs 10 vertices). - Due to irregular offsets, many of the portions above needed vertex attribute conversions. However the conversions were inefficient because they converted the whole buffer's worth of vertices starting from given offset to the end. - draw call 1 allocated a conversion buffer for 90000000 vertices' attributes even though it needs only 10 vertices. - draw call 2 allocated a conversion buffer for 80000000 vertices' attributes even though it needs only 10 vertices. - ... The problem is that the conversion buffer pool will always allocate a new sub-buffer to store the converted attribute values for a number of vertices starting from given offset to the end regardless of how many vertices the actually draw call needs. Leading to a lot of converted portions unused by any draw calls. Currently the conversion buffers are keyed by triplet {format, offset, stride}. It doesn't take into account how many vertices needed hence the issue above happened. The number of vertices would be included in the key i.e. {format, offset, stride, numVertices}. However, it would still lead to another problem in another use case: - users draw using a same offset but with different number of vertices many times => many conversion buffers would be created.
Kimmo Kinnunen
Comment 10 2021-10-25 06:34:11 PDT
(In reply to Le Hoang Quyen from comment #9) > Currently the conversion buffers are keyed by triplet {format, offset, > stride}. It doesn't take into account how many vertices needed hence the > issue above happened. > The number of vertices would be included in the key i.e. {format, offset, > stride, numVertices}. However, it would still lead to another problem in > another use case: > - users draw using a same offset but with different number of vertices many > times => many conversion buffers would be created. Thanks for the explanation! I think the general logic should be better. However, for this particular bug, it sounds like we need to just change the look up function from `offset == wantOffset` to `offset <= wantOffset` ?
Le Hoang Quyen
Comment 11 2021-10-25 07:30:32 PDT
(In reply to Kimmo Kinnunen from comment #10) > (In reply to Le Hoang Quyen from comment #9) > > Currently the conversion buffers are keyed by triplet {format, offset, > > stride}. It doesn't take into account how many vertices needed hence the > > issue above happened. > > The number of vertices would be included in the key i.e. {format, offset, > > stride, numVertices}. However, it would still lead to another problem in > > another use case: > > - users draw using a same offset but with different number of vertices many > > times => many conversion buffers would be created. > > Thanks for the explanation! I think the general logic should be better. > > However, for this particular bug, it sounds like we need to just change the > look up function from `offset == wantOffset` to `offset <= wantOffset` ? The above offsets are just approximate numbers. The actual numbers might not be multiples of 100. And each drawing portions might use different alignments from each other. Yes, the lookup could be changed to something better (perhaps using `offset mod alignment` as key instead of offset). However, `offset <= wantOffset` alone doesn't work because there are several limits of Metal I noticed: - The final offset of the vertex attribute is calculated by: final offset = bufferOffset + attribOffset. - `bufferOffset` = offset set via MTLRenderCommandEncoder.setVertexBuffer - `attribOffset` = MTLVertexAttributeDescriptor's offset field. - Metal requires: - `bufferOffset` to be multiples of the attribute's data type's alignment. - `attribOffset` must be multiples of 4. - Furthermore, `attribOffset` + attribute's size must be less than or equal to the fetching stride.
Kyle Piddington
Comment 12 2021-10-28 17:17:28 PDT
EWS Watchlist
Comment 13 2021-10-28 17:18:32 PDT
Note that there are important steps to take when updating ANGLE. See https://trac.webkit.org/wiki/UpdatingANGLE
Darin Adler
Comment 14 2021-10-28 17:31:22 PDT
Comment on attachment 442768 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=442768&action=review > Source/ThirdParty/ANGLE/src/libANGLE/renderer/metal/VertexArrayMtl.mm:1079 > + VertexConversionBufferMtl * vertexConverison = (VertexConversionBufferMtl *)conversion; Spelling error "conversion".
Kyle Piddington
Comment 15 2021-10-29 20:03:06 PDT
Dean Jackson
Comment 16 2021-11-01 15:37:21 PDT
Comment on attachment 442895 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=442895&action=review > Source/ThirdParty/ANGLE/src/libANGLE/renderer/metal/VertexArrayMtl.mm:1074 > + VertexConversionBufferMtl * vertexConverison = (VertexConversionBufferMtl *)conversion; Typo here: vertexConverison
Kyle Piddington
Comment 17 2021-11-03 12:31:09 PDT
Created attachment 443223 [details] Patch for landing
EWS
Comment 18 2021-11-03 13:10:28 PDT
Committed r285220 (243845@main): <https://commits.webkit.org/243845@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 443223 [details].
Note You need to log in before you can comment on or make changes to this bug.