Integrations

Key	Value
Status	Active
Owner	QA Automation
Updated	2026-03-26
Scope	External systems that PW-Tests reads from, writes to, or publishes through

PW-Tests is useful on a laptop, but it becomes an operational system because of its integrations. Slack gives runs a voice, OpenSearch gives them memory, Grafana gives them shape, GitLab gives them schedules, and Confluence gives the team a readable long-term home for how everything works.

Integration Map

Integration	Main Use
Slack	failure alerts, summaries, weekly reports, recovery replies
Grafana	dashboards, trend reading, investigation links, dashboard validation
OpenSearch	structured records, screenshots, step records, history queries
GitLab	scheduled runs, post-deploy checks, manual jobs, artifacts
Confluence	published human documentation

Slack

Slack is where most people first encounter the system. That makes message quality important.

What Slack Handles

Slack Output	Why It Exists
failure alerts	immediate run awareness
success summaries	low-noise reassurance after scheduled runs
weekly and monthly reports	broader operational summary
visual notifications	specialized visual-suite outcomes
recovery replies	closes the loop when a failure stops repeating

Current Slack Direction

The system now aims for calmer, more useful alerts:

short headline first
stats and failed tests second
a human-style “Initial read” instead of a raw classifier dump
investigation threads with Assessment, Why, and Next
grouped failures when several tests clearly share one cause
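
The ordering above can be sketched as a small formatter. This is an illustration only, not the repo's implementation: `RunSummary`, `FailedTest`, `groupByCause`, `buildAlertText`, and every field name are hypothetical, and the exact wording of each line is an assumption.

```typescript
// Hypothetical shapes; the real field names in PW-Tests may differ.
interface FailedTest {
  name: string;
  cause: string; // short, human-readable suspected cause
}

interface RunSummary {
  suite: string;
  passed: number;
  failed: number;
  failures: FailedTest[];
  initialRead: string; // one or two sentences written for humans
}

interface FailureGroup {
  cause: string;
  tests: string[];
}

// Group failures that share a cause so one root problem reads as one line.
function groupByCause(failures: FailedTest[]): FailureGroup[] {
  const groups: FailureGroup[] = [];
  for (const f of failures) {
    const existing = groups.filter((g) => g.cause === f.cause)[0];
    if (existing) {
      existing.tests.push(f.name);
    } else {
      groups.push({ cause: f.cause, tests: [f.name] });
    }
  }
  return groups;
}

// Headline first, stats second, grouped failed tests third, Initial read last.
function buildAlertText(run: RunSummary): string {
  const lines: string[] = [];
  lines.push(`${run.suite}: ${run.failed} failed`);
  lines.push(`Passed ${run.passed} / Failed ${run.failed}`);
  for (const group of groupByCause(run.failures)) {
    lines.push(`- ${group.tests.join(", ")} (${group.cause})`);
  }
  lines.push(`Initial read: ${run.initialRead}`);
  return lines.join("\n");
}
```

Grouping before rendering keeps the alert short when several tests break for one reason, which is the point of the last item in the list.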

Slack Credentials

Variable	Purpose
`SLACK_WEBHOOK_URL`	basic delivery path
`SLACK_BOT_TOKEN`	richer posting, threading, uploads
`SLACK_CHANNEL`	primary alerts destination
`SLACK_REPORTS_CHANNEL`	weekly and monthly report destination
`SLACK_ALERTS`	CI-level alert toggle
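
One way these variables could combine is sketched below. The variable names come from the table; everything else is an assumption made for illustration: the `resolveSlackDelivery` function, the preference for the bot token over the webhook, the fallback of the reports channel to the alerts channel, and the idea that `SLACK_ALERTS` set to `false` or `0` disables delivery.

```typescript
type Env = Record<string, string | undefined>;

type SlackDelivery =
  | { mode: "disabled" }
  | { mode: "bot"; token: string; channel: string; reportsChannel: string }
  | { mode: "webhook"; url: string };

// Decide how (or whether) to deliver Slack messages from the environment.
function resolveSlackDelivery(env: Env): SlackDelivery {
  // Assumption: the CI-level toggle is a string and defaults to enabled.
  const toggle = (env["SLACK_ALERTS"] ?? "true").toLowerCase();
  if (toggle === "false" || toggle === "0") {
    return { mode: "disabled" };
  }

  // Assumption: the bot token is preferred because it allows threading and uploads.
  const token = env["SLACK_BOT_TOKEN"];
  const channel = env["SLACK_CHANNEL"];
  if (token && channel) {
    // Assumption: reports fall back to the alerts channel when no reports channel is set.
    const reportsChannel = env["SLACK_REPORTS_CHANNEL"] ?? channel;
    return { mode: "bot", token, channel, reportsChannel };
  }

  // The webhook is the basic delivery path when no bot token is configured.
  const url = env["SLACK_WEBHOOK_URL"];
  if (url) {
    return { mode: "webhook", url };
  }
  return { mode: "disabled" };
}
```

In a Node process the function would typically be called as `resolveSlackDelivery(process.env)`. Taking the environment as a parameter keeps the sketch testable without real credentials.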

%%{init: {'theme':'base', 'themeVariables': {'primaryColor': '#4a90d9', 'primaryTextColor': '#fff', 'primaryBorderColor': '#2c6fad', 'lineColor': '#555', 'fontFamily': 'sans-serif'}}}%%
flowchart TD
    CI["CI nightly run fails"] --> ALERT["Main Slack alert\nheadline + stats + failed tests + Initial read"]
    ALERT --> THREAD["Investigation thread reply\nAssessment / Why / Next per failure or cluster"]
    THREAD --> WATCH["Operator watches next run"]
    WATCH --> PASS["Failure stops repeating"]
    WATCH --> PERSIST["Failure repeats → investigate"]
    PASS --> RECOVERY["Recovery reply posted\nto original thread"]
    RECOVERY --> CLOSED["Thread closed with context"]

The diagram shows the full lifecycle of a Slack failure alert: initial notification, investigation thread, and then either a recovery reply once the failure stops repeating or further investigation if it persists.
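
The branch at "Operator watches next run" can be sketched as a pure function. The `OpenThread` and `ThreadAction` types and the `nextThreadAction` function are hypothetical. The sketch also assumes that a single clean run counts as "stops repeating"; the real threshold may be stricter.

```typescript
// Hypothetical record of an alert thread that is still open.
interface OpenThread {
  threadId: string; // identifier of the original alert thread
  failingTests: string[]; // tests that failed when the alert was posted
}

type ThreadAction =
  | { kind: "recovery"; threadId: string; recovered: string[] }
  | { kind: "investigate"; threadId: string; stillFailing: string[] };

// Compare the open thread against the failures of the latest run.
function nextThreadAction(thread: OpenThread, latestFailures: string[]): ThreadAction {
  const stillFailing = thread.failingTests.filter(
    (test) => latestFailures.indexOf(test) !== -1
  );
  if (stillFailing.length === 0) {
    // Nothing from the original alert repeated: reply in the original thread.
    return { kind: "recovery", threadId: thread.threadId, recovered: thread.failingTests };
  }
  // At least one original failure repeated: keep the thread open.
  return { kind: "investigate", threadId: thread.threadId, stillFailing };
}
```

Returning the original thread identifier in both cases reflects the diagram: the recovery reply goes to the same thread, so the context stays in one place.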

Grafana

Grafana is where operators go when they need to understand patterns across many runs rather than the outcome of a single run.

What Grafana Covers Today

Dashboard Area	Typical Use
Status	current health and recent run summaries
Investigate	failure triage and drill-down
Trends	pass rate, recurrence, and site/suite history

Important Supporting Workflows

Workflow	Why It Matters
dashboard-as-code	dashboards are reviewed and versioned like code
query validation	broken queries can be caught without opening the UI manually
visual monitoring	dashboards are checked in a browser for empty or broken panels
observability verification	health and data quality can be checked by script
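
As a rough illustration of query validation without opening the UI, the sketch below scans a dashboard definition for panels that have no queries attached. It assumes the dashboard JSON follows Grafana's usual `panels[].targets[]` layout. The `findPanelsWithoutQueries` function and the list of panel types to skip are hypothetical, and the repo's actual validation may check much more, such as query syntax.

```typescript
// Minimal subset of a Grafana dashboard definition needed for this check.
interface Panel {
  title?: string;
  type: string;
  targets?: unknown[];
}

interface Dashboard {
  title: string;
  panels: Panel[];
}

// Return the titles of panels that should have queries but have none.
function findPanelsWithoutQueries(dashboard: Dashboard): string[] {
  // Assumption: row and text panels legitimately carry no queries.
  const nonQueryTypes = ["row", "text"];
  return dashboard.panels
    .filter((panel) => nonQueryTypes.indexOf(panel.type) === -1)
    .filter((panel) => !panel.targets || panel.targets.length === 0)
    .map((panel) => panel.title ?? "(untitled)");
}
```

A check like this could run in CI against the versioned dashboard files and fail the job when the returned list is not empty.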

OpenSearch

OpenSearch is the durable telemetry layer behind dashboards and many analysis workflows.

What It Stores

Record Type	Why It Exists
test records	human-facing dashboard and report data
event records	machine-facing event stream
screenshot records	investigation evidence
step records	deeper debugging context
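
The four record types can be modelled as a small discriminated set. The sketch below is hypothetical: `TelemetryRecord`, its `runId` and `timestamp` fields, and `countByType` are illustrative names, not the repo's schema. It shows a per-type count of the kind that could serve as a sanity check before a batch is written.

```typescript
type RecordType = "test" | "event" | "screenshot" | "step";

// Hypothetical common envelope; real records carry more fields.
interface TelemetryRecord {
  type: RecordType;
  runId: string;
  timestamp: string; // ISO 8601
}

// Count how many records of each type a batch contains.
function countByType(records: TelemetryRecord[]): Record<RecordType, number> {
  const counts: Record<RecordType, number> = { test: 0, event: 0, screenshot: 0, step: 0 };
  for (const record of records) {
    counts[record.type] += 1;
  }
  return counts;
}
```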

Operational Reality

This repo uses two OpenSearch access patterns: shared-writer and admin-level. The distinction matters for retention setup, mappings, and direct-write paths.

GitLab

GitLab is the scheduler and transport layer for most production-facing behavior.

GitLab Handles

GitLab Role	Examples
schedules	nightly, visual, performance, monitor runs
manual jobs	visual baseline updates, dashboard deploys, targeted runs
artifacts	blob reports, test results, traces, screenshots
CI variables	credentials, flags, schedule type
deployment-adjacent safety	PDT, reports, visual notifications
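
Because the schedule type arrives as a CI variable, a run may need to read and validate it before choosing what to execute. The sketch below is an assumption-heavy illustration: the variable name `SCHEDULE_TYPE` and the `parseScheduleType` function are hypothetical, the accepted values are taken from the schedules row above, and treating an unset variable as "not a scheduled run" is a guess.

```typescript
// Schedule kinds listed in the table above.
const SCHEDULE_TYPES = ["nightly", "visual", "performance", "monitor"] as const;
type ScheduleType = (typeof SCHEDULE_TYPES)[number];

// Read the schedule type from CI variables.
// Returns undefined when the variable is unset (assumed to mean a manual or ad hoc run).
function parseScheduleType(env: Record<string, string | undefined>): ScheduleType | undefined {
  // Hypothetical variable name; the real one is defined in the CI configuration.
  const raw = (env["SCHEDULE_TYPE"] ?? "").trim().toLowerCase();
  if (raw === "") {
    return undefined;
  }
  for (const candidate of SCHEDULE_TYPES) {
    if (candidate === raw) {
      return candidate;
    }
  }
  // Fail loudly on typos so a misconfigured schedule does not silently run nothing.
  throw new Error(`Unknown schedule type: ${raw}`);
}
```

Throwing on an unknown value is a deliberate choice in this sketch: a mistyped schedule variable is easier to diagnose as a failed job than as a run that quietly skipped its suite.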

Confluence

Confluence is where the system is explained for humans. The source lives in this repo:

Source Path	Purpose
`docs/wiki/`	main wiki tree
`docs/confluence/`	standalone Confluence pages
`.confluence/`	publishing scripts and writing rules

Important rule: these pages should read like operating documentation, not like generated command dumps.

Local Dashboard

The repo also has a local Next.js control center. It is not a hosted integration in the same sense as Slack or Grafana, but it is part of the operator experience.

Capability	Why People Use It
run overview	fast local triage
artifacts and logs	convenient local inspection
trends and flaky views	quicker than raw log reading
docs browser	local access to operational docs

Teams usually wire integrations in this order:

Slack, so failures are visible
Grafana and OpenSearch, so failures are explainable
GitLab schedules, so checks run automatically
Confluence publishing, so the docs stay aligned with the codebase

Need	Go To
env vars and credentials	Configuration Guide
alert and report behavior	Reporting
telemetry model	Logging System
publishable docs rules	`.confluence/WRITING_RULES.md` in the repo