Fix Flaky Tests with Pest Repeat

I recently ran into an issue with a flaky test in our CI process. Most of the time, it would pass, but when it failed, it meant running all the tests again and hoping it would pass on the next try.

When I finally got fed up with the waiting, I ran the test locally. It passed, which left me confused about what was going on in CI.

That’s when I started using the repeat method with Pest to debug what was going on.

I will use a simple Post model as an example of what I was running into.

public function up(): void
{
    Schema::create('posts', function (Blueprint $table) {
        $table->id();
        $table->string('title');
        $table->string('content')->nullable();
        $table->unsignedBigInteger('user_id');
        $table->string('status');
        $table->dateTime('published_at')->nullable();
        $table->timestamps();
    });
}

class Post extends Model
{
    use HasFactory;

    protected $casts = [
        'status' => PostStatus::class,
        'published_at' => 'datetime',
    ];

    protected $attributes = [
        'status' => PostStatus::Idea,
    ];

    protected function isPublished(): Attribute
    {
        return Attribute::make(
            get: function () {
                return $this->status === PostStatus::Published && $this->published_at?->lt(now());
            }
        );
    }
}

PostStatus is an enum with the following cases:

enum PostStatus: string
{
    case Idea = 'idea';
    case Draft = 'draft';
    case Published = 'published';
    case Hidden = 'hidden';
}

So far, everything is pretty straightforward. A post is considered published if it has a status of PostStatus::Published and a published_at date before the current date/time. I have two simple tests set up for this.

uses(RefreshDatabase::class);

it('is published', function () {
    $post = Post::factory()->create([
        'status' => PostStatus::Published
    ]);

    expect($post->is_published)->toBeTrue();
});

it('is not published', function () {
    $post = Post::factory()->create();

    expect($post->is_published)->toBeFalse();
});

I run the tests locally and everything passes.

[Screenshot: flaky tests passing]

Everything’s looking good. However, when I merge my PR, CI runs and one of these tests fails. Or even worse, it passes, and then a teammate comes along with a new PR and one of these tests fails even though nothing related to them was changed. What just happened?

I decide I need to fix this flaky test, so I start looking into it using the repeat method.

it('is published', function () {
    $post = Post::factory()->create([
        'status' => PostStatus::Published
    ]);

    expect($post->is_published)->toBeTrue();
})->repeat(100);

it('is not published', function () {
    $post = Post::factory()->create();

    expect($post->is_published)->toBeFalse();
})->repeat(100);

Now, when I run the tests, I see something like the following:

[Screenshot: repeated flaky tests not passing]

Our tests pass most of the time, but not all the time, so we know something is going on here. Since these tests are fairly simple, the first thing I will check is the factory.

class PostFactory extends Factory
{
    protected $model = Post::class;

    public function definition(): array
    {
        return [
            'title' => $this->faker->word(),
            'content' => $this->faker->paragraph(),
            'status' => $this->faker->randomElement(PostStatus::cases()),
            'published_at' => $this->faker->dateTimeBetween(now()->subYear(), now()->addWeek()),
            'user_id' => User::factory(),
        ];
    }
}

Look at that: the factory randomly picks a status as well as a date that could be in the past or the future. So, depending on what gets picked, the test can fail. For this example, the best thing to do would probably be to take the randomness out of the base definition and move the variations into states, like the following.

class PostFactory extends Factory
{
    protected $model = Post::class;

    public function definition(): array
    {
        return [
            'title' => $this->faker->word(),
            'status' => PostStatus::Idea,
            'user_id' => User::factory(),
        ];
    }

    public function published(): static
    {
        return $this->state(function () {
            return [
                'status' => PostStatus::Published,
                'published_at' => now()->subWeek(),
            ];
        });
    }

    public function scheduled(): static
    {
        return $this->state(function () {
            return [
                'status' => PostStatus::Published,
                'published_at' => now()->addWeek(),
            ];
        });
    }

    public function idea(): static
    {
        return $this->state(function () {
            return [
                'content' => null,
                'status' => PostStatus::Idea,
                'published_at' => null,
            ];
        });
    }

    public function draft(): static
    {
        return $this->state(function () {
            return [
                'status' => PostStatus::Draft,
                'published_at' => null,
            ];
        });
    }

    public function hidden(): static
    {
        return $this->state(function () {
            return [
                'status' => PostStatus::Hidden,
                'published_at' => now()->addWeek(),
            ];
        });
    }
}

Notice that I changed the base definition to be much simpler with no randomness and then added the various states I would expect. For large projects where a factory may already be used in many places, you may need to be careful changing the base definition because it could break other tests.
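To put rough numbers on how flaky the original definition was, here is a small plain-PHP simulation that needs neither Laravel nor Faker. It is only a sketch under stated assumptions: randomPost(), isPublished(), and estimateFailureRates() are hypothetical helpers written for this illustration, the timestamp and seed are arbitrary, a year is treated as 365 days, and both the status and the date are assumed to be picked uniformly.

```php
<?php

declare(strict_types=1);

// Plain-PHP stand-in for the original factory (illustration only).
enum PostStatus: string
{
    case Idea = 'idea';
    case Draft = 'draft';
    case Published = 'published';
    case Hidden = 'hidden';
}

/**
 * Mimic the original definition(): a random status and a random
 * published_at between one year ago and one week from now.
 */
function randomPost(int $now): array
{
    $cases = PostStatus::cases();

    return [
        'status' => $cases[mt_rand(0, count($cases) - 1)],
        'published_at' => mt_rand($now - 365 * 86400, $now + 7 * 86400),
    ];
}

// Same rule as the model accessor: Published status and a date in the past.
function isPublished(array $post, int $now): bool
{
    return $post['status'] === PostStatus::Published && $post['published_at'] < $now;
}

/** Estimate how often each original test would fail over $runs attempts. */
function estimateFailureRates(int $runs, int $seed): array
{
    mt_srand($seed);          // arbitrary seed so the run is reproducible
    $now = 1_700_000_000;     // arbitrary fixed "now" as a Unix timestamp

    $failures = ['is published' => 0, 'is not published' => 0];

    for ($i = 0; $i < $runs; $i++) {
        // 'is published' overrides the status only; the date stays random.
        $post = randomPost($now);
        $post['status'] = PostStatus::Published;
        if (! isPublished($post, $now)) {
            $failures['is published']++;
        }

        // 'is not published' overrides nothing at all.
        if (isPublished(randomPost($now), $now)) {
            $failures['is not published']++;
        }
    }

    // array_map keeps the string keys when given a single array.
    return array_map(fn (int $count): float => $count / $runs, $failures);
}

$rates = estimateFailureRates(100_000, 42);
printf("is published fails:     %.1f%%\n", $rates['is published'] * 100);
printf("is not published fails: %.1f%%\n", $rates['is not published'] * 100);
```

Under these assumptions, the first test should fail only when the random date lands in the future week (about 7 in 372 runs, or roughly 2%), while the second should fail whenever the factory happens to produce a published post with a past date (about a quarter of the time). A roughly 2% failure rate is easy to miss when running a test once or twice locally, but over 100 repeats the chance of seeing at least one failure works out to about 85%.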

With the factory updates in place, I updated the tests to the following:

uses(RefreshDatabase::class);

it('is published', function () {
    $post = Post::factory()->published()->create();

    expect($post->is_published)->toBeTrue();
})->repeat(100);

it('is not published', function ($post) {
    expect($post->is_published)->toBeFalse();
})->with([
    'idea post' => fn () => Post::factory()->idea()->create(),
    'draft post' => fn () => Post::factory()->draft()->create(),
    'scheduled post' => fn () => Post::factory()->scheduled()->create(),
    'hidden post' => fn () => Post::factory()->hidden()->create(),
])->repeat(100);

Now when I run them, I get the output below.

[Screenshot: repeated tests passing]

Since the second test now uses a dataset, it repeats 100 times for each of the four items in the dataset. That gives 400 assertions for the second test plus 100 for the first, which is why there are 500 assertions versus the original 200.

Now that I am confident with the tests, I can remove the repeat method.

[Screenshot: fixed flaky tests]

Flaky tests can be extremely frustrating, so having more tools to track them down is always useful. A factory issue is what first led me to repeat, but there are plenty of other causes.

One of the biggest culprits I’ve seen is data not being cleaned up properly between tests, so something created in one test affects the next.

Another is timestamps. I’ve seen tests start failing at the start or end of a month, around daylight saving time changes, or because a test hard-coded a date that was not far enough in the future, and once that date passed, the test started failing. You can use something like Carbon::setTestNow(today()); or Laravel’s other time helpers to avoid those issues.
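As a minimal sketch of pinning the clock, the Pest file below freezes "now" before each test and then moves it forward explicitly. It assumes a Laravel project where Pest tests are bound to the application's base TestCase (so $this->travelTo() is available), that the model lives at App\Models\Post, and that the factory has the scheduled() state shown earlier. The frozen date is arbitrary.

```php
<?php

use App\Models\Post; // assumed namespace for the Post model
use Illuminate\Foundation\Testing\RefreshDatabase;
use Illuminate\Support\Carbon;

uses(RefreshDatabase::class);

beforeEach(function () {
    // Pin "now" to a fixed, arbitrary instant for every test in this file.
    Carbon::setTestNow('2024-01-15 12:00:00');
});

afterEach(function () {
    // Calling setTestNow() with no argument restores the real clock.
    Carbon::setTestNow();
});

it('is not published until the scheduled time has passed', function () {
    // scheduled() sets published_at to one week after the frozen "now".
    $post = Post::factory()->scheduled()->create();

    expect($post->is_published)->toBeFalse();

    // travelTo() is provided by Laravel's base TestCase.
    $this->travelTo(now()->addWeeks(2));

    expect($post->is_published)->toBeTrue();
});
```

Because both the factory state and the accessor read the same frozen clock, the outcome no longer depends on when the suite happens to run.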

Before Pest 2.0 and PHPUnit 10, you could pass --repeat=100 on the command line to get similar results. However, that option was removed in PHPUnit 10, so on newer versions you need Pest’s repeat method. This GitHub issue has some workarounds if you can’t use Pest: Repeating tests · Issue #5174 · sebastianbergmann/phpunit.
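For plain PHPUnit 10 or later, one possible stand-in is a data provider that yields the same empty dataset many times, which makes PHPUnit run the test once per dataset. This is a sketch of a general technique, not necessarily one of the workarounds from the linked issue. The class name, the hundredRuns() provider, and the App\Enums\PostStatus namespace are assumptions made for illustration.

```php
<?php

declare(strict_types=1);

namespace Tests\Feature;

use App\Enums\PostStatus; // assumed namespace for the enum
use App\Models\Post;      // assumed namespace for the model
use Illuminate\Foundation\Testing\RefreshDatabase;
use PHPUnit\Framework\Attributes\DataProvider;
use Tests\TestCase;

final class PostPublishedTest extends TestCase
{
    use RefreshDatabase;

    /**
     * Returns 100 empty datasets, so the test below runs 100 times.
     * Data providers must be public and static in PHPUnit 10.
     */
    public static function hundredRuns(): array
    {
        return array_fill(0, 100, []);
    }

    #[DataProvider('hundredRuns')]
    public function test_is_published(): void
    {
        $post = Post::factory()->create([
            'status' => PostStatus::Published,
        ]);

        $this->assertTrue($post->is_published);
    }
}
```

Each run is reported as a separate dataset, so a failure shows which iteration tripped. As with repeat, the provider is meant to come out again once the test is stable.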

Now, go and fix those flaky tests! Let me know if you have any other strategies for flaky tests. Thanks for reading!
