Overview
Czech News Center QA · Built 2026-03-31

PW-Tests

Playwright tests for CNC websites. 10 suites, 7 sites, everything wired into OpenSearch and Grafana.

Pass Rate: 96%
Total Tests: 5000
Sites: 7
Commits (30d): 363
Fixes Applied: 0
Test Suites: 10

Suite Health at a Glance
Suite        Pass Rate   Tests (passed/total)   Duration
Ads          100%        549/549                6.5s
Content      100%        45/45                  9.6s
E2e          96.55%      700/725                9.8s
Events       100%        9/9                    11.4s
Mobile       100%        270/270                3.1s
Pdt          96.53%      1950/2020              35.2s
Shadow       98.35%      417/424                1m 14s
Smoke        100%        108/108                6.4s
Unknown      95.83%      506/528                11.0s
User Flows   79.5%       256/322                9.3s

Site Health
Overall health per site, aggregated across all suites.
All 98.35%
Auto.cz 93.69%
Blesk.cz 96.07%
E15.cz 97.49%
Isport.cz 97.89%
Opinio.cz 93.42%
Reflex.cz 98.3%

Test Suites

10 suites, each checking something different.

Sites

7 Czech News Center websites under test.

Site Comparison
Site        URL               Consent   Health
All         all               Unknown   98.35%
Auto.cz     www.auto.cz       CPEX      93.69%
Blesk.cz    www.blesk.cz      CPEX      96.07%
E15.cz      www.e15.cz        Didomi    97.49%
Isport.cz   isport.blesk.cz   CPEX      97.89%
Opinio.cz   opinio.cz         CPEX      93.42%
Reflex.cz   www.reflex.cz     Didomi    98.3%

CI Runners

7 projects on 2 GitLab runners

Monthly Development Report

Features, tests added, and infrastructure improvements by month.

Monthly Failure Report

Failure timeline, root causes, fix stories, and unresolved issues.

Architecture

System components, data flow, and directory structure.

Data Flow
Tests feed three parallel paths that all end at Slack.
Project Structure
Core Components

Methodology

Selector priorities, failure categories, and the auto-healing loop.

Selector Strategy

Pick the most stable selector you can.
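
As a rough illustration of "most stable first", the sketch below ranks candidate selectors by kind. The ordering (test id, then role, then text, then CSS, then XPath) and the names `classifySelector` and `pickMostStable` are assumptions made for this example; the portal's actual priority levels are not listed on this page.

```typescript
// Hypothetical stability ranking, most stable first. The real levels may differ.
type SelectorKind = 'testId' | 'role' | 'text' | 'css' | 'xpath';

const STABILITY_ORDER: SelectorKind[] = ['testId', 'role', 'text', 'css', 'xpath'];

// Classify a selector string by the Playwright selector syntax it uses.
function classifySelector(selector: string): SelectorKind {
  if (/\[data-testid=/.test(selector)) return 'testId';
  if (selector.indexOf('role=') === 0) return 'role';
  if (selector.indexOf('text=') === 0) return 'text';
  if (selector.indexOf('//') === 0 || selector.indexOf('xpath=') === 0) return 'xpath';
  return 'css';
}

// Return the candidate whose kind ranks highest, or undefined for an empty list.
function pickMostStable(candidates: string[]): string | undefined {
  return candidates.slice().sort(
    (a, b) =>
      STABILITY_ORDER.indexOf(classifySelector(a)) -
      STABILITY_ORDER.indexOf(classifySelector(b)),
  )[0];
}
```

For example, given `.header-main`, `//div[1]`, and `[data-testid="header"]`, this picks the `data-testid` selector, which is the style the test writing template on this page uses.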

Self-Healing Workflow

Test breaks? The system tries to fix it before anyone has to look.

Failure Classification

Every failure gets a category. Some we fix automatically, others need a human.

Auto-Fixable
Requires Investigation
Test Writing Template
import { test, expect } from '@playwright/test';
import { CncSite } from '../src/core';

test.describe('Feature @smoke @blesk', () => {
  let site: CncSite;

  test.beforeEach(async ({ page }) => {
    site = new CncSite(test, page, 'blesk');
    await site.load('/', 'Homepage');
    await site.consent(true);
  });

  test('should load homepage', async () => {
    await expect(site.page).toHaveTitle(/Blesk/);
    await site.assertElementVisible('[data-testid="header"]');
  });
});

Observability Stack

OpenSearch stores it, Grafana shows it, Prometheus measures it, Slack yells about it.

Stack Overview
OpenSearch Indices
Index Pattern     Purpose                                   Retention       Updated By
cncqa_tests-*     Test results for Grafana dashboards       90 days         Reporter
cncqa_events-*    Detailed events for AI/machine analysis   30 days         EventLogger
*-YYYY-MM-img     Failure screenshots (base64)              30 days (ISM)   Reporter
*-YYYY-MM-cr      Step records (pw:api traces)              30 days (ISM)   Reporter
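
The `*-YYYY-MM-img` and `*-YYYY-MM-cr` patterns imply that the reporter derives the index name from the current month. A minimal sketch of that naming follows; the function name `monthlyIndexName`, the use of UTC, and the `example-prefix` value are assumptions, since the real prefix is not shown in the table.

```typescript
// Suffixes taken from the table above: screenshots ("img") and step records ("cr").
type MonthlyIndexKind = 'img' | 'cr';

// Build "<prefix>-YYYY-MM-<kind>". UTC is an assumption; the reporter's
// actual timezone handling is not documented here.
function monthlyIndexName(prefix: string, kind: MonthlyIndexKind, date: Date): string {
  const year = date.getUTCFullYear();
  const monthNumber = date.getUTCMonth() + 1; // getUTCMonth() is zero-based
  const month = monthNumber < 10 ? `0${monthNumber}` : String(monthNumber);
  return `${prefix}-${year}-${month}-${kind}`;
}

// Hypothetical prefix, for illustration only.
const exampleIndex = monthlyIndexName('example-prefix', 'img', new Date(Date.UTC(2026, 2, 31)));
```

Because the month is part of the name, a new index starts automatically on the first write of each month, which is what lets a retention policy delete old months as whole indices.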

Development Report & Timeline

What the team delivered, told in words. Release changelog below.

March 2026 (March 1 – 31, 2026)
Failure Intelligence Pipeline

We moved from simple substring matching to a multi-layered classification engine. Failures are now matched against a structured incident store using weighted fingerprinting across seven dimensions. A historical confidence layer tracks how often each test has failed for each root cause, blending past patterns with current evidence to produce verdicts that get smarter over time.

  • Incident store with six root cause domains and seventeen hierarchical tags
  • Weighted incident matcher scoring fingerprint, site, error category, selector overlap, date window, URL pattern, and message content
  • Historical prior service computing confidence bands from confirmed recovery events
  • Cause assessor blending both layers into a final verdict with human-readable explanation
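
One way to picture the weighted matcher and the blending step is the sketch below. The seven dimension names come from the list above; the weight values, the 0-to-1 signal scale, the `priorWeight` default, and the function names are all assumptions chosen for illustration, not the values used in the actual engine.

```typescript
// The seven dimensions named above. Each signal is a similarity in [0, 1].
type Dimension =
  | 'fingerprint' | 'site' | 'errorCategory' | 'selectorOverlap'
  | 'dateWindow' | 'urlPattern' | 'messageContent';

type Signals = Record<Dimension, number>;

// Hypothetical weights that sum to 1. The real weights are not given in this report.
const WEIGHTS: Signals = {
  fingerprint: 0.3,
  site: 0.1,
  errorCategory: 0.15,
  selectorOverlap: 0.15,
  dateWindow: 0.1,
  urlPattern: 0.1,
  messageContent: 0.1,
};

// Weighted sum of the clamped signals: how well a failure matches one incident.
function matchScore(signals: Signals): number {
  let score = 0;
  for (const key of Object.keys(WEIGHTS) as Dimension[]) {
    const clamped = Math.min(1, Math.max(0, signals[key]));
    score += WEIGHTS[key] * clamped;
  }
  return score;
}

// Blend current evidence with a historical prior (both in [0, 1]).
// priorWeight is an assumed tuning knob, not a documented parameter.
function blendConfidence(currentScore: number, historicalPrior: number, priorWeight = 0.3): number {
  return priorWeight * historicalPrior + (1 - priorWeight) * currentScore;
}
```

With these assumed weights, a failure that matches an incident only on fingerprint scores 0.3, and a strong history for the same root cause can lift the blended confidence above what current evidence alone would give.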
Slack Alert Transformation

Nightly failure notifications were completely rewritten to communicate in human terms instead of dumping raw failure counts. Each failure now gets a verdict label: confirmed regression, post-fix watching, likely flaky, infra suspicion, or needs confirmation. Failures are clustered by site, and an investigation thread is posted automatically with per-failure breakdowns and next-step recommendations.

  • Recovery detection posts a confirmation to the original failure thread once a test passes on consecutive runs
  • Thread state machine manages open, resolved, superseded, and stale failure threads
  • Weekly report redesigned with executive summary, trend comparison, and root cause breakdown
  • Escalation contact suggestion appended to alerts based on failure classification
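
The thread state machine and recovery detection could look roughly like this. The four states are the ones named above; the event names, the transition table, and the default of two consecutive passes are assumptions for the sketch, since the report does not specify them.

```typescript
// States named in the report.
type ThreadState = 'open' | 'resolved' | 'superseded' | 'stale';
// Hypothetical events that could drive the transitions.
type ThreadEvent = 'recovered' | 'supersededByNewThread' | 'inactive' | 'failedAgain';

// Assumed transition table. Events missing for a state leave it unchanged.
const TRANSITIONS: Record<ThreadState, Partial<Record<ThreadEvent, ThreadState>>> = {
  open: { recovered: 'resolved', supersededByNewThread: 'superseded', inactive: 'stale' },
  resolved: { failedAgain: 'open' },
  superseded: {},
  stale: { failedAgain: 'open' },
};

function nextState(state: ThreadState, event: ThreadEvent): ThreadState {
  const target = TRANSITIONS[state][event];
  return target === undefined ? state : target;
}

// A test counts as recovered when its most recent runs all passed.
// results is ordered oldest to newest; true means the run passed.
function hasRecovered(results: boolean[], requiredPasses = 2): boolean {
  if (results.length < requiredPasses) return false;
  return results.slice(-requiredPasses).every((passed) => passed);
}
```

Keeping the transitions in a table rather than in nested conditionals makes it easy to see that, in this sketch, a superseded thread is terminal while a stale one can reopen.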
Operations & Escalation System

A brand-new escalation system answers the question that classification alone could not: the test failed and QA confirmed it is real, so who do I contact? Three normalized JSON databases map contacts, sites, and routing rules across twelve escalation categories. A resolver module implements strict precedence matching, and the portal page presents it all as an interactive workflow and lookup tool.

  • Seventy-eight CNC sites and twenty-seven contacts seeded from the ownership spreadsheet
  • Twelve escalation categories from content issues to video player failures
  • Five-step visual workflow on the portal: Test Fails → QA Triages → Classify → Escalate → Recovery
  • Build-time validation with eight error checks and five warning checks
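
Strict precedence matching can be sketched as "the most specific matching rule wins". The rule shape, the precedence order (site plus category, then category only, then site only, then catch-all), and every identifier below are assumptions for illustration; the real JSON schema and precedence rules are not shown in this report.

```typescript
// Hypothetical routing rule. An omitted field matches anything.
interface RoutingRule {
  site?: string;
  category?: string;
  contactId: string;
}

// Assumed precedence: site+category (3) > category only (2) > site only (1) > catch-all (0).
function specificity(rule: RoutingRule): number {
  if (rule.site !== undefined && rule.category !== undefined) return 3;
  if (rule.category !== undefined) return 2;
  if (rule.site !== undefined) return 1;
  return 0;
}

// Return the contact of the most specific matching rule.
// On a tie, the rule listed first wins because the comparison is strict.
function resolveContact(rules: RoutingRule[], site: string, category: string): string | undefined {
  let best: RoutingRule | undefined;
  for (const rule of rules) {
    const siteOk = rule.site === undefined || rule.site === site;
    const categoryOk = rule.category === undefined || rule.category === category;
    if (!siteOk || !categoryOk) continue;
    if (best === undefined || specificity(rule) > specificity(best)) best = rule;
  }
  return best ? best.contactId : undefined;
}

// Illustrative data only. These contacts and rules are made up.
const exampleRules: RoutingRule[] = [
  { contactId: 'qa-default' },
  { category: 'video-player', contactId: 'video-team' },
  { site: 'blesk', category: 'video-player', contactId: 'blesk-video-owner' },
];
```

With the example rules, a video player failure on `blesk` resolves to `blesk-video-owner`, the same failure on another site falls back to `video-team`, and anything else lands on `qa-default`.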
Project Portal

The documentation portal launched with thirty-one pages, CNC brand design, content registry sidebar, and dark mode support. A second version is in progress with a modular build pipeline: separate collectors for git, OpenSearch, CI config, and test results feed into interactive templates that can show live data alongside static documentation.

  • New pages: escalation matrix, content registry, per-suite operational runbooks
  • Portal v2 architecture: collectors, derived data, template engine, cache layer
Grafana 2.0 Dashboard Overhaul

All three Grafana dashboards were redesigned. Status got a sites-by-suites matrix with per-cell drill-down links. Investigate replaced its table with a Dynamic Text panel showing formatted errors and stack traces. Trends got proper multi-select variables and fixed data links. The reporter was enhanced with ANSI stripping, normalization, and over five hundred step wrappers across twenty-seven test files.

  • Resolved the UUID hyphen parsing bug that caused "No data" across Investigate queries
  • Ruled out the suspected .keyword field mismatch: all indices have proper sub-fields
  • Production ISM policies and index templates deployed for retention management
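
The ANSI stripping mentioned above matters because Playwright error messages carry terminal colour codes that show up as noise in Grafana. A minimal sketch follows; the function names are assumptions, and the whitespace collapsing stands in for the reporter's normalization step, whose actual rules are not described here.

```typescript
// Matches ANSI CSI sequences: ESC [ , parameter bytes, intermediate bytes, final byte.
// Covers colour codes such as "\u001b[31m" and erase codes such as "\u001b[2K".
const ANSI_CSI = /\u001b\[[0-?]*[ -\/]*[@-~]/g;

function stripAnsi(text: string): string {
  return text.replace(ANSI_CSI, '');
}

// Assumed normalization: strip colour codes, collapse whitespace runs, trim.
function normalizeErrorMessage(raw: string): string {
  return stripAnsi(raw).replace(/\s+/g, ' ').trim();
}
```

Normalizing before indexing also means that two runs of the same failure produce identical message strings, which helps any later grouping or fingerprinting.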
Observability & Infrastructure

Several infrastructure issues were found and fixed. OpenSearch indices that had been accumulating indefinitely now rotate monthly with automatic thirty-day cleanup. A twelve-check verification script validates observability health. A Playwright-based Grafana monitor catches dashboard rendering failures that API queries cannot detect.

  • Fixed disk-full incident caused by static index names without retention
  • Unified telemetry: three loggers replaced by single EventLogger, removing twelve hundred lines of dead code
  • Visual regression tests rewritten from fifty-four failing tests to eighteen passing in under thirty seconds
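
The monthly rotation with thirty-day cleanup corresponds to an OpenSearch ISM policy that moves an index to a delete state once it reaches a minimum age. The sketch below shows the general shape of such a policy body as a TypeScript object. The field names follow the ISM policy format, but the description, state names, index patterns, and priority are assumptions, not the deployed production policy.

```typescript
// Minimal typing for the parts of an ISM policy body used here.
interface IsmTransition { state_name: string; conditions?: { min_index_age: string } }
interface IsmState { name: string; actions: Array<Record<string, object>>; transitions: IsmTransition[] }
interface IsmPolicyBody {
  policy: {
    description: string;
    default_state: string;
    states: IsmState[];
    ism_template: Array<{ index_patterns: string[]; priority: number }>;
  };
}

// Hypothetical policy: keep an index for 30 days, then delete it.
const thirtyDayRetention: IsmPolicyBody = {
  policy: {
    description: 'Delete monthly screenshot and step-record indices after 30 days (example)',
    default_state: 'active',
    states: [
      {
        name: 'active',
        actions: [],
        transitions: [{ state_name: 'delete', conditions: { min_index_age: '30d' } }],
      },
      { name: 'delete', actions: [{ delete: {} }], transitions: [] },
    ],
    // Assumed globs matching the "*-YYYY-MM-img" and "*-YYYY-MM-cr" index names.
    ism_template: [{ index_patterns: ['*-img', '*-cr'], priority: 100 }],
  },
};
```

Attaching the policy through `ism_template` means each new monthly index picks it up on creation, so retention no longer depends on anyone remembering to clean up.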
Documentation Expansion

The wiki grew from fourteen to sixteen pages. Confluence standalone pages expanded from four to fifteen with per-suite operational runbooks. Eight Mermaid diagrams were added or restored. A Slack message formatting guide documents verdict labels, clustered thread style, and wording rules.

Release Changelog

Per-version details.