There are two kinds of flaky/intermittent tests in Servo. The
traditional kind is the test that fails on the CI, but has an associated
bug indicating that the test is an intermittent failure. Many of these
tests have completely unstable results, for instance those where an
unpredictable set of subtests fail. It's impossible to generate stable
results for these, so we have traditionally simply discard these
unexpected results.
Another kind of intermittent test is one that will produce an expected
result when rerun (ie will flake). Some of these are also labeled with
bugs, while some are not. In some cases, there is flakiness in some core
Servo functionality that can lead to *any* test flaking, such as a race
condition that can lead to an early screenshot for reftests. When these
kinds of tests do not have associated bugs, they cause the CI to fail.
In this case, it is impossible to label these tests as intermittent
because it can literally be any test.
This change, reruns failed tests in order to detect unlabeled tests in
the second category. Instead of blocking the CI when the second run
leads to expected results, the CI will now pass, but the flake will be
reported to the new flakiness dashboard. This prevents unrelated flakes
from slowing down the merge queue.
Use the new intermittent dashboard to report intermittents and get
information about open bugs. This is now used to filter out
known-intermittents from results. In addition, this also allows the
scripts to report bug information to the GitHub. Display that in all
output.
After filtering intermittents, output the results as JSON. Update the
GitHub workflow to aggregate this JSON data into an artifact and use the
aggregated data to generate a GitHub comment with details about the try
run. The idea here is that this comment will make it easier to track
intermittent tests and notice when a change affects a test marked as
intermittent -- either causing it to permanently fail or fixing it.
GitHub supports adding files to an artifact in parallel, as long as the
filenames are unique. This makes it easier to download build results
when more than a single builder fails.
The GitHub Action needs access to Servo repository secrets, so switch to
using the 'pull_request_target' event. Since these PRs have more
complete access to the Servo repository, do not execute the version of
the upstream script that comes with the PR. Instead, simply fetch the
changes. To make this work, the script no longer expects the PR commit
to be checked out, merely that they exist in the repository somewhere.
This is a modified version of the webhook found at
servo/upstream_wpt_webhook and deployed via SaltStack. It is updated to
use modern Python and to assume that GitHub Actions will fetch the
appropriate source code locally before the script is run.
Fixes#29206.
Fixes#23798.
Signed-off-by: Martin Robinson <mrobinson@igalia.com>