252269 – Match upstream WPT semantics for testharness tests

NEW 252269

https://bugs.webkit.org/show_bug.cgi?id=252269

Sam Sneddon [:gsnedders]

Reported 2023-02-14 15:09:54 PST

We know that a disproportionate amount of flakiness in LayoutTests comes from WPT; some of this simply reflects the rate at which tests are added there compared with the rest of LayoutTests. But we also break the testharness API guarantees by expecting more from these tests by default: we expect what we log in the console to be constant (bug 161310), we expect assertion messages to be constant (bug 161693), and we expect the order of subtests to be constant (bug 252268).
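As a rough illustration of the looser comparison described above, the sketch below compares two runs by subtest name and status only, ignoring subtest order and assertion messages. `SubtestResult` and `same_outcome` are hypothetical names invented for this example; they are not part of WebKit's tooling or of testharness.js, and the sample subtests are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubtestResult:
    # Hypothetical record for one testharness subtest; not a WebKit type.
    name: str
    status: str        # e.g. "PASS" or "FAIL"
    message: str = ""  # assertion message, allowed to differ between runs


def same_outcome(expected, actual):
    """Return True if both runs have the same (name, status) pairs.

    Subtest order and assertion messages are deliberately ignored.
    Sorting the pairs (rather than building a dict) keeps duplicate
    subtest names from silently collapsing into one entry.
    """
    def summarize(results):
        return sorted((r.name, r.status) for r in results)

    return summarize(expected) == summarize(actual)


# Made-up data: same statuses, different order and a different message.
expected = [
    SubtestResult("first subtest", "PASS"),
    SubtestResult("second subtest", "FAIL", "assert_equals: expected 1 but got 2"),
]
actual = [
    SubtestResult("second subtest", "FAIL", "assert_equals: expected 1 but got 3"),
    SubtestResult("first subtest", "PASS"),
]
print(same_outcome(expected, actual))
```

Under these assumptions the two runs count as equivalent, while a run with a missing subtest or a changed status would still be reported as a difference.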


Radar WebKit Bug Importer

Comment 1 2023-02-14 15:11:16 PST

<rdar://problem/105469744>

Sam Sneddon [:gsnedders]

Comment 2 2023-02-14 15:18:24 PST

From bug 161693, though all of these are really relevant to the wider issue:

(In reply to Darin Adler from comment #5)
> I’m not going to say review+ because I am not sure this is the best solution.
>
> The cost of this is that when a test does fail, we won’t be able to see
> which assertion failed. And if we turn the real failures back on then we
> will get failures because of expected files that expect the failure log to
> be disabled.
>
> So is that cost worth the benefit of less flakiness? Is there some other way
> of achieving that goal? Maybe we can adjust things so that the code that
> compares the results with expected results can understand what may vary?

(In reply to Chris Dumez from comment #7)
> r=me with comments. For the record, I do think the best solution would be to
> avoid such flaky asserts upstream in web-platform-tests. However, based on
> feedback, I think this is unlikely to happen. As a result, I think this
> patch is the second best option and we can do this easily and quickly. This
> is MUCH better than skipping tests.

(In reply to Alex Christensen from comment #8)
> I think ideally we wouldn't need this because all the tests would always
> have deterministic output. Since that isn't the case, we need something
> like this. We should bring it up when we discuss this in a few weeks.

Sam Sneddon [:gsnedders]

Comment 3 2023-02-15 13:13:34 PST

When it comes to flakiness, there's also the extreme solution that Chromium took: If the harness status is OK, and all subtests are PASS, and there's no console output, don't bother with an expectation file whatsoever and just treat that test as passing.
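The rule described above can be written down as a small predicate. This is only a sketch of the rule as stated in this comment, not Chromium's actual code; the function name, its parameters, and the status strings are assumptions made for illustration.

```python
def needs_expectation_file(harness_status, subtest_statuses, console_output):
    """Return False when a run is "clean" and can be treated as passing.

    A run is clean when the harness status is OK, every subtest is PASS,
    and nothing was logged to the console. Assumption: a run with zero
    subtests is also treated as all-PASS here, since all() of an empty
    sequence is True.
    """
    all_pass = all(status == "PASS" for status in subtest_statuses)
    no_console = not console_output.strip()
    clean = harness_status == "OK" and all_pass and no_console
    return not clean


# A fully passing, silent run needs no expectation file under this rule.
print(needs_expectation_file("OK", ["PASS", "PASS"], ""))
# Any failing subtest still requires a checked-in expectation.
print(needs_expectation_file("OK", ["PASS", "FAIL"], ""))
```

The appeal of this design is that the most common case carries no expectation file at all, so there is nothing for subtest order or message wording to disagree with.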

Kiet Ho

Comment 4 2025-05-29 14:47:28 PDT

One source of flakiness: WKTR prints "Blocked access to external URL ...", which gets included in the test output to be compared. But the order of these messages can be arbitrary and causes flakiness, even though all subtests PASS.
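One possible way to make the comparison insensitive to the order of these messages is to normalize the output before comparing it, as sketched below. The helper name is hypothetical, the only thing taken from this bug is the message prefix, and the example URLs and the exact text following the prefix are invented for the demonstration.

```python
BLOCKED_PREFIX = "Blocked access to external URL"


def normalize_blocked_messages(output):
    """Hoist the "Blocked access" lines to the top in sorted order.

    All other lines keep their relative order, so only the position and
    ordering of the blocked-access messages stop mattering.
    """
    lines = output.splitlines()
    blocked = sorted(line for line in lines if line.startswith(BLOCKED_PREFIX))
    others = [line for line in lines if not line.startswith(BLOCKED_PREFIX)]
    return "\n".join(blocked + others)


# Hypothetical outputs from two runs that differ only in message order.
run_a = "\n".join([
    "Blocked access to external URL https://b.example/",
    "Blocked access to external URL https://a.example/",
    "PASS subtest",
])
run_b = "\n".join([
    "Blocked access to external URL https://a.example/",
    "PASS subtest",
    "Blocked access to external URL https://b.example/",
])
print(normalize_blocked_messages(run_a) == normalize_blocked_messages(run_b))
```

This keeps the messages in the compared output, so a newly blocked URL would still show up as a difference; dropping the lines entirely would be a stronger and lossier alternative.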

Status NEW

Resolution

Priority P2

Severity Normal

Classification Unclassified

Version WebKit Nightly Build

Hardware Unspecified

OS Unspecified

Product WebKit

Component Tools / Tests

Assignee Nobody

Reported 2023-02-14 15:09 PST

Modified 2025-05-29 14:47 PDT

CC List 6 users

Keywords InRadar

Depends on 161693, 252268, 161310