Testing Glossary

Flaky Test

A test that produces inconsistent results — sometimes passing, sometimes failing — without any changes to the code under test.

Flaky tests erode confidence in a test suite because engineers can no longer trust a red build to signal a real problem. Over time, teams start ignoring failures, which defeats the purpose of automated testing.

Common Causes

Flaky tests typically stem from a handful of root causes. Shared mutable state is one of the most frequent: when tests depend on a database row, file, or in-memory object that another test modifies, execution order determines the outcome. Timing and concurrency issues are equally common: hardcoded sleeps, race conditions, and network timeouts can all produce intermittent failures. Environment dependencies, such as a specific timezone, locale, or external API endpoint, also introduce non-determinism.

Why Flakiness Matters in CI

In a continuous integration pipeline, every commit triggers a build. If even a small percentage of tests are flaky, the probability of at least one spurious failure per build grows quickly with suite size: with n independent flaky tests that each fail spuriously with probability p, that probability is 1 - (1 - p)^n. This leads to wasted developer time investigating phantom failures, increased cycle times, and a culture of re-running pipelines until they go green, which is an expensive habit at scale.
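To make the growth concrete, the sketch below computes the chance that at least one flaky test fails in a single build. It assumes flaky tests fail independently and all share the same spurious failure rate. Both are simplifying assumptions, and the 1% rate is chosen purely for illustration.

```python
def spurious_failure_probability(num_flaky_tests: int, flake_rate: float) -> float:
    """Chance that at least one flaky test fails in a build.

    Assumes independent tests that share the same spurious failure rate.
    """
    return 1.0 - (1.0 - flake_rate) ** num_flaky_tests


# Illustrative suite sizes with an assumed 1% flake rate per test.
for n in (10, 50, 200):
    print(n, round(spurious_failure_probability(n, 0.01), 3))
```

Under these assumptions, 10 flaky tests give roughly a 10% chance of a spurious red build, 50 give roughly 40%, and 200 give roughly 87%.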

Strategies for Mitigation

The first step is detection. Monitoring tools like TestGlance track pass/fail history per test, making it straightforward to flag tests whose outcomes flip without code changes. Once identified, flaky tests should be quarantined — kept in the suite but excluded from blocking the build — until they are fixed.
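The detection idea can be sketched in a few lines of Python. This is not how TestGlance or any particular tool implements it. `RunRecord` and `find_flaky_tests` are hypothetical names, and the sketch assumes the commit hash is a good enough stand-in for "no code changes".

```python
from collections import defaultdict
from typing import Iterable, NamedTuple, Set


class RunRecord(NamedTuple):
    """One recorded outcome of one test (hypothetical record shape)."""
    test_name: str
    commit: str
    passed: bool


def find_flaky_tests(runs: Iterable[RunRecord]) -> Set[str]:
    """Return tests that both passed and failed on the same commit."""
    outcomes = defaultdict(set)
    for run in runs:
        outcomes[(run.test_name, run.commit)].add(run.passed)
    # Two distinct outcomes on one commit means the result flipped
    # without a code change.
    return {name for (name, _commit), seen in outcomes.items() if len(seen) == 2}
```

A test that passes on one commit and fails on the next is not flagged, because the code change may explain the difference. Only flips within a single commit count.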

Fixing usually involves improving test isolation so each test controls its own state, replacing sleeps with explicit waits or polling, and mocking external services. Test retries can act as a short-term safety net, but they should be treated as a stopgap rather than a cure: a test that only passes on retry still has an unresolved problem. Retrying a test three times and accepting a single pass masks the underlying issue and inflates build duration.
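Replacing a fixed sleep with polling might look like the following sketch. `wait_until` is a hypothetical helper rather than part of any specific framework, and the timeout and interval defaults are arbitrary.

```python
import threading
import time


def wait_until(condition, timeout=5.0, interval=0.05):
    """Poll condition() until it returns True or the timeout elapses.

    Returns True as soon as the condition holds, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Usage: wait for background work to finish instead of sleeping a fixed time.
done = threading.Event()
threading.Timer(0.1, done.set).start()  # simulated asynchronous work
ready = wait_until(done.is_set, timeout=2.0)
```

Unlike a hardcoded sleep, the wait returns as soon as the condition holds, and the timeout only bounds the worst case, so the test is both faster on average and less sensitive to a slow machine.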

Tracking flakiness as a metric alongside coverage and duration gives teams the visibility they need to keep their test suite healthy and their CI pipeline trustworthy.
