Logging System

Key	Value
Status	Active
Owner	QA Automation
Updated	2026-03-26
Scope	EventLogger, reporter records, artifacts, and how run evidence is stored

The logging system is the backbone of PW-Tests. Once you understand what gets written, you understand why Slack alerts can carry more context, why Grafana can answer useful questions, and why failure history becomes more useful over time.

What The Logging System Is For

Goal	Why It Matters
preserve run evidence	failures should be debuggable after the fact
support operators	humans need summaries and artifacts
support automation	scripts need structured records, not just console text
support history	recurrence, priors, and trends depend on stored data

Main Parts

Part	What It Produces
EventLogger	structured event stream
reporter	test summaries, screenshot records, step records
local artifacts	logs, traces, screenshots, run files
OpenSearch	durable searchable telemetry
history files	recurrence and recovery memory

EventLogger

The EventLogger replaced the older split logging model. It now acts as the common event pipeline across the suites that support it.

EventLogger Characteristic	Why It Helps
buffered writes	avoids noisy per-event file churn
local plus remote output	works locally, scales in CI
shared identifiers	links events to reporter records and runs
schema versioning	makes data evolution safer
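
As a rough illustration of the characteristics in the table, the sketch below shows one way a buffered logger could behave. It is not the actual EventLogger: the class name `BufferedEventLogger`, the `flush_every` and `remote_sink` parameters, the event field names, and the `SCHEMA_VERSION` value are all assumptions made for the example.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path

SCHEMA_VERSION = 1  # hypothetical value; the real versioning scheme is not described here


class BufferedEventLogger:
    """Illustrative stand-in for EventLogger: buffers events, flushes them as JSONL."""

    def __init__(self, log_dir, run_id, flush_every=50, remote_sink=None):
        self.path = Path(log_dir) / f"{run_id}.jsonl"
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.run_id = run_id
        self.flush_every = flush_every
        self.remote_sink = remote_sink  # optional callable that receives a list of events
        self._buffer = []

    def log(self, event_type, test_id=None, **fields):
        # Shared identifiers (run_id, test_id) let events be joined to other records.
        event = {
            "schema_version": SCHEMA_VERSION,
            "run_id": self.run_id,
            "test_id": test_id,
            "type": event_type,
            "ts": datetime.now(timezone.utc).isoformat(),
            **fields,
        }
        self._buffer.append(event)
        if len(self._buffer) >= self.flush_every:
            self.flush()

    def flush(self):
        if not self._buffer:
            return
        batch, self._buffer = self._buffer, []
        # One append per batch instead of one write per event.
        with self.path.open("a", encoding="utf-8") as fh:
            for event in batch:
                fh.write(json.dumps(event) + "\n")
        if self.remote_sink is not None:
            self.remote_sink(batch)


# Usage: with flush_every=2, the second event triggers a flush to both outputs.
shipped = []
with tempfile.TemporaryDirectory() as tmp:
    logger = BufferedEventLogger(tmp, "run-001", flush_every=2, remote_sink=shipped.extend)
    logger.log("test_start", test_id="login")
    logger.log("test_end", test_id="login", status="pass")
    lines = logger.path.read_text(encoding="utf-8").splitlines()
```

In this sketch the remote sink is just a callable, so a local run can leave it unset while a CI run could pass something that ships batches to a remote store. A real implementation would also need to flush at the end of a run so buffered events are not lost.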

%%{init: {'theme':'base', 'themeVariables': {'primaryColor': '#4a90d9', 'primaryTextColor': '#fff', 'primaryBorderColor': '#2c6fad', 'lineColor': '#555', 'fontFamily': 'sans-serif'}}}%%
flowchart LR
    TEST["Test run"] --> EL["EventLogger\nbuffered in-memory"]
    EL --> JSONL["JSONL logs\ntest-results/logs/"]
    EL --> OS["OpenSearch\ncncqa_tests-*\ncncqa_events-*"]
    OS --> GR["Grafana dashboards\nhuman-readable panels"]
    OS --> DASH["Control Center\nlocalhost:3001"]
    JSONL --> HEAL["Healer\nfailure triage"]
    JSONL --> FACT["Visual fact-checker\nscreenshot review"]

The diagram shows how EventLogger feeds both local JSONL files and remote OpenSearch, and how downstream consumers read from each store.

Reporter Output

The reporter adds the human-facing layer used by Grafana and some Slack/reporting workflows.

Reporter Output	Typical Use
test records	suite and failure dashboards
screenshot records	visual context in investigation
step records	reconstructing what happened before failure
merged run result data	CI notifications and reports

Where Evidence Lives

Path Or Store	Contents
`test-results/logs/`	structured local event logs
`test-results/artifacts/`	per-test failure artifacts
`test-results/results.json`	merged run result summary
`test-results/history/`	run history used for recurrence logic
OpenSearch `cncqa_tests-*`	human-facing records
OpenSearch `cncqa_events-*`	event stream records
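
For the local side of that table, a script can read the structured logs directly. The following is a minimal sketch that assumes the files under `test-results/logs/` use a `.jsonl` extension with one JSON object per line, and that events carry a `test_id` field; both are assumptions, and the helper names are hypothetical.

```python
import json
from pathlib import Path


def iter_events(log_dir="test-results/logs"):
    """Yield (path, line_number, event) for each JSON object line in *.jsonl files."""
    for path in sorted(Path(log_dir).glob("*.jsonl")):
        with path.open(encoding="utf-8") as fh:
            for line_no, line in enumerate(fh, start=1):
                line = line.strip()
                if not line:
                    continue
                try:
                    event = json.loads(line)
                except json.JSONDecodeError:
                    # Skip lines that do not parse, e.g. a partially written line.
                    continue
                if isinstance(event, dict):
                    yield path, line_no, event


def events_for_test(log_dir, test_id):
    """Collect every event recorded for one test (assumes a `test_id` field)."""
    return [event for _, _, event in iter_events(log_dir) if event.get("test_id") == test_id]
```

Calling `events_for_test("test-results/logs", "login")` would then return the events for a test named `login`, in file order, if the assumed layout holds.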

Why Structured Logging Matters

Structured records make these workflows possible:

classify failures by shape instead of only by error text
detect repeated failures across days
feed Grafana with consistent fields
match a new failure against known incidents
decide whether a thread should receive a recovery reply
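
To make the first two workflows concrete, here is a small sketch of classifying failures by shape and spotting shapes that repeat across days. The field names (`test_id`, `error_type`, `error_message`, `date`), the masking rules, and the sample records are invented for illustration and are not taken from the PW-Tests schema.

```python
import hashlib
import re
from collections import defaultdict


def failure_shape(record):
    """Reduce a failure record to a short, stable fingerprint."""
    message = record.get("error_message", "")
    # Mask volatile details (hex values, numbers) so equivalent failures hash the same.
    message = re.sub(r"0x[0-9a-fA-F]+|\d+", "<n>", message)
    message = re.sub(r"\s+", " ", message).strip().lower()
    key = "|".join([record.get("test_id", ""), record.get("error_type", ""), message])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:12]


def recurring_failures(records, min_days=2):
    """Return {fingerprint: sorted dates} for shapes seen on at least min_days distinct days."""
    days_by_shape = defaultdict(set)
    for record in records:
        days_by_shape[failure_shape(record)].add(record["date"])
    return {shape: sorted(days) for shape, days in days_by_shape.items() if len(days) >= min_days}


# Made-up sample records: the two checkout timeouts differ only in volatile details.
records = [
    {"date": "2026-03-24", "test_id": "checkout", "error_type": "TimeoutError",
     "error_message": "Timeout 30000ms exceeded waiting for #submit"},
    {"date": "2026-03-25", "test_id": "checkout", "error_type": "TimeoutError",
     "error_message": "Timeout 30012ms exceeded  waiting for #submit"},
    {"date": "2026-03-25", "test_id": "login", "error_type": "AssertionError",
     "error_message": "expected 200 got 500"},
]
recurring = recurring_failures(records)
```

Matching on raw error text would treat the two checkout timeouts as different failures. Fingerprinting on structured fields groups them, which is the kind of comparison free-form console output cannot support reliably.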

Common Record Types

Record Family	Examples
run-level	start, end, summary, duration
test-level	pass, fail, skip, retry
step-level	action, navigation, selector interaction
evidence-level	screenshot and trace linkage
enrichment-level	category, severity, cause assessment

Human And Machine Consumers

Consumer	What It Reads
Slack alerts	merged results, failure summaries, history
Grafana	test records, screenshots, step records
observability scripts	OpenSearch queries and quality signals
healing workflows	local logs, history, fix memory
operators	artifacts, traces, screenshots, markdown runbooks

Logging Gotchas Worth Remembering

Gotcha	Why It Matters
old and new data may coexist in shared indices	aggregate queries can be misleading if they mix generations
some workflows need direct OpenSearch access	not every index or admin task should go through the Grafana proxy
visual failures and artifact-heavy suites need special handling	screenshots are useful, but artifact bloat is real
history files matter	recurrence features degrade if history is missing or not merged correctly
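
The first gotcha can be guarded against by pinning aggregate queries to one data generation. The sketch below builds an OpenSearch query body that filters on a schema version before aggregating. The field names (`schema_version`, `status`, `suite`, `@timestamp`) are assumptions about the index mapping, not documented fields, and the function name is hypothetical.

```python
import json


def build_failure_count_query(schema_version, since="now-7d"):
    """Build a search body counting failures per suite for one schema generation."""
    return {
        "size": 0,  # only the aggregation is needed, not the matching documents
        "query": {
            "bool": {
                "filter": [
                    # Restrict to one generation so old and new records are not mixed.
                    {"term": {"schema_version": schema_version}},
                    {"term": {"status": "fail"}},
                    {"range": {"@timestamp": {"gte": since}}},
                ]
            }
        },
        "aggs": {"by_suite": {"terms": {"field": "suite", "size": 20}}},
    }


body = build_failure_count_query(schema_version=2)
print(json.dumps(body, indent=2))
```

A body like this could be sent to the `_search` endpoint of the `cncqa_tests-*` index pattern. Whether the `term` filters and the `terms` aggregation behave as intended depends on those fields being mapped as keyword or numeric types, which this page does not specify.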

Which Evidence To Check

Situation	Best Evidence
single failing test	artifacts plus local logs
repeated nightly failure	history plus incident store
dashboard looks wrong	OpenSearch records plus dashboard validation
flaky interaction	step records and trace
confusing Slack alert	merged results and notifier logic

Bottom Line

PW-Tests logging is not there for posterity. It is there to shorten the time between “something failed” and “we know what to do next.”