Failure Categories

Key | Value
Status | Active
Owner | QA Automation
Updated | 2026-03-26
Scope | Symptom categories, root-cause labels, and how to interpret failures without overreacting

One of the easiest ways to make test operations worse is to confuse a symptom with its cause. A timeout is not a root cause. A selector miss is not always a test bug. This page separates the category layers the system uses so people can interpret failures more accurately.

Two Different Category Layers

Layer | Purpose
symptom category | what the failure looked like technically
root-cause label | what we believe actually caused it

Example:

  • symptom category: TIMEOUT_ELEMENT
  • root-cause label: PRODUCT.SITE_REDESIGN

That distinction matters because the action is different.
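As an illustration, the two layers can be kept apart in the data model itself. This is a minimal sketch, not the system's real schema; `FailureRecord` and its field names are hypothetical:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical record: symptom and root cause live in separate fields,
# so an unassigned cause stays visibly unassigned instead of being
# inferred from the symptom.
@dataclass
class FailureRecord:
    test_id: str
    symptom_category: str                   # what the failure looked like
    root_cause_label: Optional[str] = None  # filled in later, during triage

record = FailureRecord("checkout_smoke", "TIMEOUT_ELEMENT")
record.root_cause_label = "PRODUCT.SITE_REDESIGN"  # assigned after investigation
print(record.symptom_category, "->", record.root_cause_label)
```

Keeping the cause optional makes it obvious when a failure has been observed but not yet explained.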

Common Symptom Categories

Symptom Category | What It Usually Means
SELECTOR_NOT_IN_DOM | expected element is gone or renamed
TIMEOUT_ELEMENT | element or action did not complete in time
ELEMENT_INTERCEPTED | something blocked an interaction
NETWORK_FAILED | navigation or resource request failed
HTTP_5XX | backend or edge returned a server error
HTTP_4XX | page or endpoint not found or not allowed
PAGE_CRASHED | browser page crashed
CONTENT_MISMATCH | expected content shape does not match live output
consent-related categories | overlay or consent handling blocked progress

Common Root-Cause Themes

The incident system assigns causes in human-meaningful domains rather than echoing raw failure text.

Root-Cause Domain | Typical Meaning
PRODUCT.* | the site changed in a real way
TEST.* | the test or selector assumption is wrong
ENVIRONMENT.* | CI, timing, or network instability
CONTENT_DATA.* | content is missing, rotated, or structurally different
THIRD_PARTY.* | outside dependency drift
RELEASE_CHANGE.* | deploy or release-specific side effect
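Because each domain implies a different first action, a triage helper might key off the prefix alone. A sketch under that assumption; the `ROUTING` table and `route` function are invented for illustration and are not part of the real system:

```python
# Hypothetical routing: map a root-cause domain prefix to a default
# first response. The response strings are illustrative only.
ROUTING = {
    "PRODUCT": "file product ticket",
    "TEST": "update test or selector",
    "ENVIRONMENT": "check CI/runner health",
    "CONTENT_DATA": "verify content source",
    "THIRD_PARTY": "monitor the dependency",
    "RELEASE_CHANGE": "correlate with deploy log",
}

def route(label: str) -> str:
    # "PRODUCT.SITE_REDESIGN" -> domain "PRODUCT"
    domain = label.split(".", 1)[0]
    return ROUTING.get(domain, "triage manually")

print(route("PRODUCT.SITE_REDESIGN"))  # -> file product ticket
```

The point is that the domain, not the raw failure text, decides who acts next.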

How Operators Should Read A Failure

If You See... | Ask Next...
isolated timeout | is this a flaky one-off or a known weak spot?
repeated selector miss on one site | did the site redesign, or did our selector age out?
same failure across unrelated tests | is this really infra, or shared environment noise?
post-fix repeat | is the fix incomplete, or is this a new symptom of an old issue?

Confidence Labels In Slack

These are not raw categories. They are operator-facing verdicts built from categories plus context.

Label | What It Tells The Reader
Confirmed regression | strong evidence of a real product or test regression
Needs confirmation | not enough signal yet
Likely flaky | behavior looks intermittent rather than systemic
Infra/CI suspicion | shared environment or runner problem is more likely
Post-fix, watching | issue matches something recently fixed and is being monitored
Likely same infra event | secondary failure probably belongs to the same environment issue
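To picture how such a verdict is built from categories plus context, here is a decision sketch; the function, its arguments, and the thresholds are invented for illustration and are not the production rules:

```python
def verdict(symptom: str, repeats_7d: int,
            shared_across_tests: bool, recently_fixed: bool) -> str:
    """Map raw signals to an operator-facing label (illustrative only)."""
    # Widespread network/server symptoms point at shared infrastructure.
    if shared_across_tests and symptom in ("NETWORK_FAILED", "HTTP_5XX"):
        return "Infra/CI suspicion"
    # A match against a recently fixed issue goes into monitoring.
    if recently_fixed:
        return "Post-fix, watching"
    # Heavy repetition over the week reads as a real regression.
    if repeats_7d >= 10:
        return "Confirmed regression"
    # A lone occurrence looks intermittent.
    if repeats_7d <= 1:
        return "Likely flaky"
    return "Needs confirmation"

print(verdict("NETWORK_FAILED", 1, shared_across_tests=True,
              recently_fixed=False))  # -> Infra/CI suspicion
```

Note that the same symptom can yield different labels depending on context, which is exactly why the labels are not raw categories.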

Why Historical Context Matters

A single failure run can be misleading. History changes interpretation:

  • one timeout after a fix may be watch-only
  • ten repeats over seven days is a pattern
  • a failure with a matching resolved incident is very different from a brand-new unknown

This is why failure history, incident store data, and recovery tracking now matter so much.
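The bullet points above amount to windowed counting. A minimal sketch, assuming a rolling window; the function name and the 7-day/10-repeat thresholds are invented to match the example numbers in this section:

```python
from datetime import datetime, timedelta

def is_pattern(timestamps, now, window_days=7, threshold=10):
    """Treat repeats inside a rolling window as a pattern (thresholds illustrative)."""
    cutoff = now - timedelta(days=window_days)
    return sum(1 for t in timestamps if t >= cutoff) >= threshold

now = datetime(2026, 3, 26)
# Twelve failures spread over the last six days: a pattern.
hits = [now - timedelta(hours=12 * h) for h in range(12)]
print(is_pattern(hits, now))   # -> True
# A single failure right after a fix: watch-only, not a pattern.
print(is_pattern([now], now))  # -> False
```

The same count means different things at different densities, which is why the window matters as much as the total.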

Pattern | Good First Response
one failing test, no history, selectors look healthy | rerun and watch
several unrelated tests on one site all hit NETWORK_FAILED in one run | treat as infra suspicion first
same PDT test repeats after a visible redesign | likely product or test update needed
visual suite says snapshot missing | configuration issue, not UI regression

What Not To Do

  • do not equate timeout with product bug automatically
  • do not treat every selector miss as test negligence
  • do not post “regression detected” when the suite is actually blocked by missing baselines or infra
  • do not bury the real verdict under raw classifier vocabulary

Related Pages

Need | Go To
human-facing alert behavior | Reporting
incident matching and AI workflows | AI Processing
troubleshooting specific failures | Troubleshooting