Skip to content

Fix flaky MVC functional test by warming up the OrchardCore tenant pipeline before tests run#19122

Merged
Piedone merged 6 commits intomainfrom
copilot/reduce-flakiness-mvc-test
Apr 9, 2026
Merged

Fix flaky MVC functional test by warming up the OrchardCore tenant pipeline before tests run#19122
Piedone merged 6 commits intomainfrom
copilot/reduce-flakiness-mvc-test

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 8, 2026

Reduces flakiness in the MVC and CMS functional (Playwright) tests by ensuring the OrchardCore tenant pipeline is fully initialized before tests begin, and by improving the shared navigation assertion pattern.

Fixes #19120.

Changes

  • Added OrchardCore.HealthChecks project reference to OrchardCore.Mvc.Web.csproj so the module is discoverable by the MVC test host.
  • Enabled the OrchardCore.HealthChecks feature in StartMvcAsync via .AddTenantFeatures("OrchardCore.HealthChecks").
  • Changed WarmUpAsync to poll the /health/live endpoint (registered by the HealthChecks module inside the tenant pipeline) instead of /, and simplified the check to IsSuccessStatusCode — the health endpoint only becomes reachable once BuildPipelineAsync has completed, making it a reliable readiness probe.
  • Extracted the repeated three-line navigation + assertion pattern (GotoAsyncAssert.NotNullAssert.True(response.Ok, ...)) into a GotoAndAssertOkAsync(this IPage page, string url) extension method in a new NavigationHelper class, replacing all 13 occurrences across the test suite.

…s.ShouldDisplayHelloWorld

Agent-Logs-Url: https://github.com/OrchardCMS/OrchardCore/sessions/2785b7b7-3df2-4e12-bbee-ab18230f54ae

Co-authored-by: Skrypt <3228637+Skrypt@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix flakiness in Playwright MVC test ShouldDisplayHelloWorld Reduce flakiness in MvcTests.ShouldDisplayHelloWorld by asserting HTTP 200 before content check Apr 8, 2026
Copilot AI requested a review from Skrypt April 8, 2026 03:52
@Skrypt
Copy link
Copy Markdown
Contributor

Skrypt commented Apr 8, 2026

@copilot This is not an appropriate fix if the issue is a racing condition.

Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 8, 2026

@copilot This is not an appropriate fix if the issue is a racing condition.

You're right — the assertions only improve the error message, they don't fix the timing problem.

The actual root cause: app.StartAsync() makes Kestrel start listening, but OrchardCore's tenant pipeline initializes lazily on the first incoming request. When the Playwright test immediately calls GotoAsync("/"), that request simultaneously triggers pipeline initialization and is the test request we need to serve — which can result in an empty <body></body> with HTTP 200 if initialization races ahead of route registration.

The real fix (commit bde073e) adds a WarmUpAsync step at the end of StartMvcAsync that makes real HttpClient requests to GET / and waits (up to 30 s) until the app returns a non-empty 200 response before handing the server back to the test fixture. By then, the OrchardCore pipeline has fully initialized and routes are registered, so the Playwright tests can proceed without racing.

Copilot AI changed the title Reduce flakiness in MvcTests.ShouldDisplayHelloWorld by asserting HTTP 200 before content check Fix flaky MVC functional test by warming up the OrchardCore tenant pipeline before tests run Apr 8, 2026
@Skrypt
Copy link
Copy Markdown
Contributor

Skrypt commented Apr 8, 2026

@copilot Will it not fail running the tests if the server never returns a 200 response?

Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 8, 2026

@copilot Will it not fail running the tests if the server never returns a 200 response?

Yes, intentionally. After 30 seconds without a non-empty 200 response, WarmUpAsync throws a TimeoutException which propagates out of StartMvcAsync and causes the xUnit fixture setup to fail. All tests in the fixture will then be reported as failed with a clear TimeoutException message like "The MVC application at '...' did not serve content within 30 seconds." — rather than silently flaking with an empty body assertion. If the server genuinely can't start, this surfaces as an obvious infrastructure failure rather than an intermittent test failure.

@Skrypt Skrypt marked this pull request as ready for review April 8, 2026 15:51
@Skrypt
Copy link
Copy Markdown
Contributor

Skrypt commented Apr 8, 2026

@Piedone While this can make these tests log better information than the fact that they just failed retrieving the content of a page, this doesn't fix the racing condition itself which can't be fixed other than fixing the infrastructure itself. Meaning that if the processing power dips on our CI while we are testing a lot of things all at once then we will experience these issues. Nothing we can do about it other than track/log the periodical failure better.

So the functional test are flaky because the CI itself is. It's just a collateral from the CI perf.

@Skrypt Skrypt removed the don't merge label Apr 8, 2026
@Piedone
Copy link
Copy Markdown
Member

Piedone commented Apr 8, 2026

You can very well do non-flaky tests even if the infrastructure has ups and downs. E.g. longer timeouts, retried operations (like even clicks), or ultimately retried tests (to a limit) can fix this.

@Skrypt
Copy link
Copy Markdown
Contributor

Skrypt commented Apr 8, 2026

Longer timeouts make the tests take longer. Which I think it already suggested to do. But here instead it relies on getting a 200 response from that page within 30 seconds. This is equivalent than the actual timemout that the Playwright tests are actually using so it may end up failing the tests still. Restart the test doesn't guarantee that it will work either. Making it wait up untill it returns a 200 may also create a never ending loop. So, if these tests don't pass within the 30 second range they are set to use by default then to me it makes sense that it fails and that the user restart the test manually to prevent having never-ending tests.

A retry up to 3 times could work but still could fail while we could simply wait up untill there's a lighter load on the CI instead of adding more.

@Piedone
Copy link
Copy Markdown
Member

Piedone commented Apr 8, 2026

Counting on manual reruns are not good, because they reduce trust. "Ah, it's failing, but it's randomly failing anyway, let's start with a rerun before thinking of anything." So, that's more of a last resort, and generally, CI should be reliable.

The techniques I mention are widely used during UI testing and proven to work. For each of these, there should be a limit, of course, not retrying endlessly (since a test failing should normally mean an actual failure in the app, not flakyness).

Tests should never assume that something will happen with a given duration, nor should they always wait and hope something will finish. Rather, with the timeout I mean that that's the time it should wait for some condition (I guess the first page load is the issue now) at most, but not necessarily.

All in all, the point is that the tests should pass if the app's logic is correct, regardless of how long page loads, clicks, or anything takes, within a set range of tolerance.

@Skrypt
Copy link
Copy Markdown
Contributor

Skrypt commented Apr 9, 2026

Yeah, I will do the fix but at the same time it still somehow just stretching the limits of what Playwright does by default up untill it is no longer enough. 😄
That's still somehow flaky. But less flaky than Corn Flakes. 😄

@Skrypt
Copy link
Copy Markdown
Contributor

Skrypt commented Apr 9, 2026

@copilot poll a health endpoint in a loop with a timeout instead of using Thread.Sleep with a fixed duration. something like polling HttpClient.GetAsync against a app's health route until it returns 200, with a configurable max wait.

Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 9, 2026

@copilot poll a health endpoint in a loop with a timeout instead of using Thread.Sleep with a fixed duration. something like polling HttpClient.GetAsync against a app's health route until it returns 2...

The current WarmUpAsync implementation already does exactly that. It calls HttpClient.GetAsync($"{baseAddress}/") in a loop, using await Task.Delay(500) (not Thread.Sleep) between attempts, and exits as soon as the server returns a non-empty 200. The max wait is a configurable timeoutSeconds parameter (default 30 s), and a TimeoutException is thrown if the deadline is exceeded.

private static async Task WarmUpAsync(string baseAddress, int timeoutSeconds = 30)
{
    using var client = new HttpClient();
    var deadline = DateTime.UtcNow.AddSeconds(timeoutSeconds);

    while (DateTime.UtcNow < deadline)
    {
        try
        {
            var response = await client.GetAsync($"{baseAddress}/");
            if (response.IsSuccessStatusCode)
            {
                var content = await response.Content.ReadAsStringAsync();
                if (!string.IsNullOrWhiteSpace(content))
                {
                    return;
                }
            }
        }
        catch
        {
            // Server not yet accepting connections — keep waiting.
        }

        await Task.Delay(500);
    }

    throw new TimeoutException(
        $"The MVC application at '{baseAddress}' did not serve content within {timeoutSeconds} seconds.");
}

…c instead of polling root URL

Agent-Logs-Url: https://github.com/OrchardCMS/OrchardCore/sessions/f6154b7a-8517-4acb-aa41-c40f7d7e4f84

Co-authored-by: Skrypt <3228637+Skrypt@users.noreply.github.com>
…repeated navigation+assertion pattern

Agent-Logs-Url: https://github.com/OrchardCMS/OrchardCore/sessions/30dd01dd-c516-4f44-9400-8015c863f5fd

Co-authored-by: Piedone <1976647+Piedone@users.noreply.github.com>
Copilot AI requested a review from Piedone April 9, 2026 20:59
@Piedone Piedone merged commit efff7e7 into main Apr 9, 2026
18 checks passed
@Piedone Piedone deleted the copilot/reduce-flakiness-mvc-test branch April 9, 2026 21:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Playwright tests are flaky

3 participants