Playwright CI on GitHub Actions: Complete 2026 Guide
Production-grade Playwright CI on GitHub Actions in 2026: sharding, caching, matrix builds, traces, reports, secrets, and deployment gates. End-to-end YAML examples.
The promise of Playwright in CI is straightforward: run the same tests you wrote locally against your real application on every pull request, and surface failures with the same time-travel debugging UI Mode gives you on your laptop. The reality, in 2026, is that getting from "tests pass locally" to "tests guard every deploy" still requires deliberate choices about caching, sharding, browser installation, artifact retention, and matrix strategy. Get those wrong and your CI suite becomes the slow, flaky bottleneck developers route around instead of through.
This guide is a complete blueprint for running Playwright on GitHub Actions in 2026. We will build a pipeline from scratch: install dependencies with pnpm caching, install browsers with a layered cache, shard tests across runners, merge HTML reports, expose traces and videos as artifacts, and gate deploys on green status. Every YAML file is production-ready, every command is the current syntax for Playwright 1.49+, and every choice is justified by the tradeoff it makes. We will also cover the failure modes that take down most CI suites: cold browser installs, runners running out of memory, and traces that fill artifact storage.
If you need a Playwright primer first, the Playwright E2E Complete Guide covers fundamentals. The playwright-e2e skill gives Claude Code and Cursor the CI patterns from this guide.
The minimum viable workflow
Drop this into .github/workflows/playwright.yml and you have a passing baseline that runs on every push to main and every pull request.
name: Playwright Tests
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
jobs:
  test:
    timeout-minutes: 60
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - name: Install dependencies
        run: npm ci
      - name: Install Playwright browsers
        run: npx playwright install --with-deps
      - name: Run Playwright tests
        run: npx playwright test
      - uses: actions/upload-artifact@v4
        if: ${{ !cancelled() }}
        with:
          name: playwright-report
          path: playwright-report/
          retention-days: 14
This workflow takes about three minutes for a small suite. It will not scale beyond a few dozen tests because every run reinstalls browsers from scratch and does not parallelize.
Adding pnpm caching
Many production monorepos use pnpm. Install it with pnpm/action-setup, then let actions/setup-node cache the pnpm store, which can cut about a minute off cold installs.
- name: Install pnpm
  uses: pnpm/action-setup@v3
  with:
    version: 9.15.0
- uses: actions/setup-node@v4
  with:
    node-version: 20
    cache: pnpm
- name: Install dependencies
  run: pnpm install --frozen-lockfile
actions/setup-node@v4 with cache: pnpm reads your pnpm-lock.yaml and restores the global store from the cache. Subsequent runs that have not changed the lockfile complete the install step in under thirty seconds.
Caching Playwright browsers
The browser bundles are the largest single download in a Playwright CI run. Chromium alone is roughly three hundred megabytes uncompressed. Cache them between runs and the install step finishes in seconds.
- name: Get Playwright version
  id: playwright-version
  run: |
    VERSION=$(node -p "require('@playwright/test/package.json').version")
    echo "version=$VERSION" >> "$GITHUB_OUTPUT"
- name: Cache Playwright browsers
  uses: actions/cache@v4
  id: playwright-cache
  with:
    path: ~/.cache/ms-playwright
    key: playwright-${{ runner.os }}-${{ steps.playwright-version.outputs.version }}
- name: Install Playwright browsers
  if: steps.playwright-cache.outputs.cache-hit != 'true'
  run: pnpm exec playwright install --with-deps
- name: Install Playwright system deps only
  if: steps.playwright-cache.outputs.cache-hit == 'true'
  run: pnpm exec playwright install-deps
The cache key includes the Playwright version, so bumping Playwright invalidates the cache automatically and a fresh download happens on the next run. The install-deps fallback for the cache-hit path installs the OS-level dependencies (libnss, libxss, etc.) that the cache cannot store.
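To make the invalidation behavior concrete, here is a minimal TypeScript sketch of the same key construction. The function name browserCacheKey and its optional browser parameter are illustrative assumptions, not part of Playwright or GitHub Actions. The point is only that the key changes when, and only when, one of its inputs changes.

```typescript
// Hypothetical helper that mirrors the workflow's cache key expression.
// Nothing here is a Playwright or GitHub Actions API.
function browserCacheKey(
  runnerOs: string,
  playwrightVersion: string,
  browser?: string,
): string {
  const parts = ['playwright', runnerOs];
  // Include the browser only when a job installs a single browser, so a
  // Chromium-only cache is never restored into a Firefox job.
  if (browser) parts.push(browser);
  parts.push(playwrightVersion);
  return parts.join('-');
}

console.log(browserCacheKey('Linux', '1.49.0'));
console.log(browserCacheKey('Linux', '1.50.0'));
console.log(browserCacheKey('Linux', '1.49.0', 'firefox'));
```

A version bump from 1.49.0 to 1.50.0 produces a different key, so the old cache entry is simply never matched again.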
Sharding for parallel execution
Sharding distributes tests across runners that execute in parallel. Playwright supports sharding natively with the --shard flag.
jobs:
  test:
    strategy:
      fail-fast: false
      matrix:
        shardIndex: [1, 2, 3, 4]
        shardTotal: [4]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: pnpm/action-setup@v3
        with:
          version: 9.15.0
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: pnpm
      - run: pnpm install --frozen-lockfile
      - run: pnpm exec playwright install --with-deps
      - name: Run shard ${{ matrix.shardIndex }} of ${{ matrix.shardTotal }}
        run: pnpm exec playwright test --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }}
      - uses: actions/upload-artifact@v4
        if: ${{ !cancelled() }}
        with:
          name: blob-report-${{ matrix.shardIndex }}
          path: blob-report
          retention-days: 1
Four shards process a 200-test suite in roughly a quarter of the time. Update your playwright.config.ts to use the blob reporter so shards produce mergeable output.
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './tests',
  fullyParallel: true,
  retries: process.env.CI ? 2 : 0,
  workers: process.env.CI ? 4 : undefined,
  reporter: process.env.CI ? [['blob']] : 'list',
  use: {
    baseURL: process.env.BASE_URL ?? 'http://localhost:3000',
    trace: 'on-first-retry',
    video: 'retain-on-failure',
    screenshot: 'only-on-failure',
  },
  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
  ],
});
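As a mental model for what --shard=2/4 selects, the sketch below splits an ordered test list into contiguous chunks. It is a simplified illustration, not Playwright's actual implementation: the function name selectShard is hypothetical, and the real runner may group and balance tests differently.

```typescript
// Simplified model of --shard=index/total: split an ordered list into
// `total` contiguous chunks and return chunk number `index` (1-based).
function selectShard<T>(items: T[], index: number, total: number): T[] {
  const valid =
    Math.floor(index) === index &&
    Math.floor(total) === total &&
    total >= 1 &&
    index >= 1 &&
    index <= total;
  if (!valid) {
    throw new RangeError(`invalid shard ${index}/${total}`);
  }
  const base = Math.floor(items.length / total);
  const extra = items.length % total; // the first `extra` shards get one more item
  const start = (index - 1) * base + Math.min(index - 1, extra);
  const size = base + (index <= extra ? 1 : 0);
  return items.slice(start, start + size);
}

const tests: string[] = [];
for (let i = 1; i <= 10; i++) tests.push(`test-${i}`);

for (let shard = 1; shard <= 4; shard++) {
  console.log(`shard ${shard}/4:`, selectShard(tests, shard, 4));
}
```

Every test lands in exactly one shard, which is why the blob reports from all shards can later be merged without duplicates.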
Merging shard reports
After all shards complete, a final job merges blob reports into a single HTML report.
  merge-reports:
    if: ${{ !cancelled() }}
    needs: [test]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: pnpm/action-setup@v3
        with:
          version: 9.15.0
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: pnpm
      - run: pnpm install --frozen-lockfile
      - uses: actions/download-artifact@v4
        with:
          path: all-blob-reports
          pattern: blob-report-*
          merge-multiple: true
      - name: Merge reports
        run: pnpm exec playwright merge-reports --reporter=html ./all-blob-reports
      - uses: actions/upload-artifact@v4
        with:
          name: html-report
          path: playwright-report
          retention-days: 14
The merged HTML report includes every shard's tests in a single browsable report. Download the html-report artifact from the workflow run, unzip it, and run pnpm exec playwright show-report against the unzipped folder to open it locally.
Cross-browser matrix
For full cross-browser coverage, expand the matrix to include a browser dimension.
strategy:
  fail-fast: false
  matrix:
    browser: [chromium, firefox, webkit]
    shardIndex: [1, 2]
    shardTotal: [2]
Two shards times three browsers gives six parallel runners. For a 300-test suite, total wall-clock time stays under ten minutes. Pass the browser into the test command:
- run: pnpm exec playwright test --project=${{ matrix.browser }} --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }}
Configure each project in playwright.config.ts:
projects: [
  { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
  { name: 'firefox', use: { ...devices['Desktop Firefox'] } },
  { name: 'webkit', use: { ...devices['Desktop Safari'] } },
],
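To see how the browser and shard dimensions multiply, the following sketch expands the matrix the way GitHub Actions does (a plain cross product) and prints the command each runner would execute. The names Job, expandMatrix, and testCommand are hypothetical helpers for illustration only.

```typescript
// One entry per runner in the browser x shard matrix.
type Job = { browser: string; shardIndex: number; shardTotal: number };

// Cross product of browsers and shard indexes, mirroring a matrix strategy.
function expandMatrix(browsers: string[], shardTotal: number): Job[] {
  const jobs: Job[] = [];
  for (const browser of browsers) {
    for (let shardIndex = 1; shardIndex <= shardTotal; shardIndex++) {
      jobs.push({ browser, shardIndex, shardTotal });
    }
  }
  return jobs;
}

// The command each runner executes, matching the workflow step above.
function testCommand(job: Job): string {
  return `pnpm exec playwright test --project=${job.browser} --shard=${job.shardIndex}/${job.shardTotal}`;
}

const jobs = expandMatrix(['chromium', 'firefox', 'webkit'], 2);
for (const job of jobs) console.log(testCommand(job));
```

Because --project filters first and --shard splits what remains, each browser's tests are divided across its own two runners.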
Starting your app before tests
Most CI runs need to spin up the application before Playwright connects. The webServer config option handles startup, readiness, and teardown automatically.
export default defineConfig({
  webServer: {
    command: 'pnpm start',
    url: 'http://localhost:3000',
    reuseExistingServer: !process.env.CI,
    timeout: 120_000,
    stdout: 'pipe',
    stderr: 'pipe',
  },
  use: {
    baseURL: 'http://localhost:3000',
  },
});
For multi-service setups, pass an array:
webServer: [
  {
    command: 'pnpm --filter @qaskills/api start',
    port: 4000,
    timeout: 60_000,
  },
  {
    command: 'pnpm --filter @qaskills/web start',
    port: 3000,
    timeout: 60_000,
  },
],
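Playwright waits for each server to become ready before starting tests. With url, it polls until the server answers with a status code it treats as ready (the Playwright docs list 2xx, 3xx, and 400 through 403, not only HTTP 200). With port, it waits until the port accepts connections. When the run finishes, it shuts the servers down.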
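The Playwright documentation marks port as deprecated in favor of url, so a version of the multi-service setup written with url might look like the sketch below. The /health path is an assumption about the API service, not something the original setup defines; substitute whatever route your server answers once it is ready, and check the webServer reference for your Playwright version.

```typescript
import { defineConfig } from '@playwright/test';

export default defineConfig({
  webServer: [
    {
      command: 'pnpm --filter @qaskills/api start',
      // Assumption: the API exposes a readiness route at /health.
      url: 'http://localhost:4000/health',
      timeout: 60_000,
      // Reuse a server you already started locally; always start fresh in CI.
      reuseExistingServer: !process.env.CI,
    },
    {
      command: 'pnpm --filter @qaskills/web start',
      url: 'http://localhost:3000',
      timeout: 60_000,
      reuseExistingServer: !process.env.CI,
    },
  ],
});
```

Pointing url at a dedicated readiness route avoids starting tests against a server that accepts connections but has not finished booting.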
Secrets and environment variables
Tests often need API keys, database URLs, or Clerk credentials. Store them as GitHub Actions secrets, then expose them to the test step as environment variables.
- name: Run Playwright tests
  env:
    BASE_URL: ${{ secrets.STAGING_URL }}
    CLERK_PUBLISHABLE_KEY: ${{ secrets.CLERK_PUBLISHABLE_KEY }}
    DATABASE_URL: ${{ secrets.DATABASE_URL }}
    OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
  run: pnpm exec playwright test
Secrets are masked in logs. Never echo them or write them to artifacts. For local replication, mirror the same names in your .env.local so the same code reads from the same places.
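A missing secret usually surfaces as a confusing test failure rather than a clear configuration error. One way to fail fast is a small guard like the sketch below. The helper requireEnv is hypothetical, not a Playwright API. It takes the environment as a parameter so it can be exercised without touching real secrets, and its error message lists only variable names, never values.

```typescript
type Env = Record<string, string | undefined>;

// Returns the named variables, or throws one error listing every missing name.
// Values are never included in the message, so nothing secret reaches the logs.
function requireEnv<K extends string>(env: Env, names: readonly K[]): Record<K, string> {
  const missing = names.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
  const result = {} as Record<K, string>;
  for (const name of names) {
    result[name] = env[name] as string;
  }
  return result;
}

// Example with a stand-in environment object instead of real secrets.
const demo = requireEnv({ BASE_URL: 'https://staging.example.test' }, ['BASE_URL']);
console.log(demo.BASE_URL);
```

In a real project you might call requireEnv(process.env, ['BASE_URL', 'DATABASE_URL']) at the top of playwright.config.ts or in a global setup file, so a misconfigured runner stops before any browser launches.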
Trace, video, and screenshot artifacts
Failures need evidence. Configure Playwright to record artifacts only on failure, then upload them.
use: {
  trace: 'on-first-retry', // record once on retry
  video: 'retain-on-failure', // keep video only when test fails
  screenshot: 'only-on-failure',
},
The upload step in your workflow:
- uses: actions/upload-artifact@v4
  if: ${{ failure() }}
  with:
    name: playwright-traces-${{ matrix.shardIndex }}
    path: |
      test-results/**/trace.zip
      test-results/**/video.webm
    retention-days: 7
To investigate a CI failure locally, download the artifact and open the trace:
npx playwright show-trace ./trace.zip
The Trace Viewer opens with the captured snapshots, network requests, and console output. For more on traces, read the Playwright Screenshots Videos Traces Complete Guide.
Posting results to pull requests
Use the dorny/test-reporter action, or a similar one, to post a check run with results.
- name: Publish Test Report
  if: ${{ !cancelled() }}
  uses: dorny/test-reporter@v1
  with:
    name: Playwright Tests
    path: test-results/junit.xml
    reporter: jest-junit
Configure JUnit output in playwright.config.ts:
reporter: [
  ['junit', { outputFile: 'test-results/junit.xml' }],
  ['blob'],
],
For PR comments with screenshots embedded, consider the daun/playwright-report-summary action.
Deployment gates
Tie deploys to Playwright green status using job dependencies and environment protection rules.
  deploy-staging:
    needs: [test]
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment:
      name: staging
      url: https://staging.qaskills.sh
    steps:
      - uses: actions/checkout@v4
      # pnpm is not preinstalled on GitHub-hosted runners, so set it up before pnpm dlx
      - uses: pnpm/action-setup@v3
        with:
          version: 9.15.0
      - run: pnpm dlx vercel pull --yes --token=${{ secrets.VERCEL_TOKEN }}
      - run: pnpm dlx vercel build --prod --token=${{ secrets.VERCEL_TOKEN }}
      - run: pnpm dlx vercel deploy --prebuilt --prod --token=${{ secrets.VERCEL_TOKEN }}
The needs: [test] line forces the deploy job to wait for all Playwright shards to complete successfully. Use GitHub environment protection rules for additional approval gates.
A complete production workflow
Putting every piece together, here is the workflow used by a real production team that runs 400 Playwright tests on every PR.
name: Playwright CI
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
jobs:
  test:
    timeout-minutes: 30
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        browser: [chromium, firefox, webkit]
        shardIndex: [1, 2, 3]
        shardTotal: [3]
    steps:
      - uses: actions/checkout@v4
      - uses: pnpm/action-setup@v3
        with:
          version: 9.15.0
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: pnpm
      - run: pnpm install --frozen-lockfile
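      - name: Get Playwright version
        id: playwright-version
        run: |
          VERSION=$(node -p "require('@playwright/test/package.json').version")
          echo "version=$VERSION" >> "$GITHUB_OUTPUT"
      - name: Cache Playwright browsers
        uses: actions/cache@v4
        id: cache
        with:
          path: ~/.cache/ms-playwright
          # Key on the browser and the Playwright version. Each job installs only one browser,
          # so a shared key would let a Chromium-only cache count as a hit for Firefox or WebKit.
          key: playwright-${{ runner.os }}-${{ matrix.browser }}-${{ steps.playwright-version.outputs.version }}
      - run: pnpm exec playwright install --with-deps ${{ matrix.browser }}
        if: steps.cache.outputs.cache-hit != 'true'
      - run: pnpm exec playwright install-deps ${{ matrix.browser }}
        if: steps.cache.outputs.cache-hit == 'true'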
      - name: Run tests
        env:
          BASE_URL: ${{ secrets.STAGING_URL }}
          DATABASE_URL: ${{ secrets.DATABASE_URL }}
        run: pnpm exec playwright test --project=${{ matrix.browser }} --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }}
      - uses: actions/upload-artifact@v4
        if: ${{ !cancelled() }}
        with:
          name: blob-${{ matrix.browser }}-${{ matrix.shardIndex }}
          path: blob-report
          retention-days: 1
      - uses: actions/upload-artifact@v4
        if: ${{ failure() }}
        with:
          name: traces-${{ matrix.browser }}-${{ matrix.shardIndex }}
          path: test-results
          retention-days: 7
  merge-reports:
    if: ${{ !cancelled() }}
    needs: [test]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: pnpm/action-setup@v3
        with:
          version: 9.15.0
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: pnpm
      - run: pnpm install --frozen-lockfile
      - uses: actions/download-artifact@v4
        with:
          path: all-blob-reports
          pattern: blob-*
          merge-multiple: true
      - run: pnpm exec playwright merge-reports --reporter=html ./all-blob-reports
      - uses: actions/upload-artifact@v4
        with:
          name: html-report
          path: playwright-report
          retention-days: 14
Cost tuning
GitHub Actions bills by minute on private repos. The most expensive shape is a long matrix run that hits the timeout. Optimize:
| Lever | Impact |
|---|---|
| pnpm + lockfile cache | -60 seconds per job |
| Playwright browser cache | -45 seconds per job |
| fullyParallel: true | 2-4x worker speedup per shard |
| Sharding | Wall-clock divided by shard count |
| concurrency.cancel-in-progress | Cancels superseded PR runs |
| fail-fast: false | Costs more but surfaces all failures |
| Smoke vs full project split | Run smoke on PR, full on main |
For a typical 200-test suite, applying every lever brings wall-clock time from 12 minutes down to under 4.
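Sharding shortens wall-clock time but not the bill, because every shard pays its own setup overhead and GitHub rounds each job up to a whole minute. The rough model below makes that tradeoff visible. The function estimateRun and the sample numbers are illustrative assumptions, not measurements from a real pipeline.

```typescript
// Rough cost model: every job pays a fixed setup overhead (checkout, install,
// browser restore), and test time is divided evenly across shards.
interface Estimate {
  wallClockMinutes: number; // time until a job finishes (all jobs assumed equal)
  billedMinutes: number; // sum over all jobs, each rounded up to a whole minute
}

function estimateRun(
  testMinutes: number, // total test time on one runner, per browser
  setupMinutes: number, // per-job overhead before tests start
  shards: number,
  browsers: number,
): Estimate {
  const perJob = setupMinutes + testMinutes / shards;
  return {
    wallClockMinutes: perJob,
    billedMinutes: Math.ceil(perJob) * shards * browsers,
  };
}

console.log(estimateRun(10, 2, 1, 1));
console.log(estimateRun(10, 2, 4, 1));
```

With these sample inputs, one unsharded job takes 12 minutes and bills 12, while four shards finish in 4.5 minutes but bill 20. That is why cutting setup time with caches pays off more as the shard count grows.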
Common pitfalls
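Cold runners with stale apt repos. apt-get update can fail intermittently on a fresh runner. Let playwright install --with-deps (or install-deps) manage system packages instead of maintaining your own apt-get steps, so the package list always matches your Playwright version.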
Tests passing locally but failing in CI. Almost always timing. Increase expect.timeout and add explicit waits for network responses. Avoid waitForTimeout.
Artifact storage overruns. Default retention is 90 days. Cut to 7-14 days for traces, 30 for reports. Use if: failure() to upload traces only on failure.
Flaky parallelism. When tests share state via DB or filesystem, parallel runs collide. Use unique data per worker (process.env.TEST_WORKER_INDEX) or isolate with transactions.
Long install times. If browser cache is invalidating every run, your key includes a frequently changing input. Use the Playwright version, not the lockfile hash.
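For the unique-data-per-worker advice under flaky parallelism, a naming helper along these lines is one option. Playwright sets TEST_WORKER_INDEX for each worker process. SHARD_INDEX is a hypothetical variable you would export yourself in the workflow (for example from matrix.shardIndex), since worker indexes restart on every runner and are not unique across shards. The function uniqueName is an illustrative helper, not a Playwright API.

```typescript
// Builds a resource name that is unique per shard and per worker, so parallel
// runs sharing one database or filesystem do not collide.
function uniqueName(base: string, env: Record<string, string | undefined>): string {
  // Set by Playwright for every worker process.
  const worker = env.TEST_WORKER_INDEX ?? '0';
  // Assumption: exported by your workflow; not set by Playwright itself.
  const shard = env.SHARD_INDEX ?? '1';
  return `${base}_s${shard}_w${worker}`;
}

console.log(uniqueName('orders', { TEST_WORKER_INDEX: '3', SHARD_INDEX: '2' }));
console.log(uniqueName('orders', {}));
```

In a test or fixture you would pass process.env as the second argument and use the result as a schema name, table prefix, or temp directory.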
Anti-patterns
- Running the entire suite on every PR for repositories with hundreds of tests. Run a smoke subset; gate merge to main on full coverage.
- Storing secrets in workflow files or in commit history. Always use secrets.*.
- Conflating CI orchestration with test logic. Keep playwright.config.ts portable so developers can replicate CI locally with CI=1 pnpm test.
- Recording video for every test. WebM files compound rapidly; use retain-on-failure.
- Ignoring fail-fast in matrix runs. fail-fast: false finishes the full matrix and surfaces every failure, which costs more but saves a second debug round.
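The smoke-subset advice in the first item could be expressed as a dedicated project, as in the sketch below. The @smoke marker is an assumed naming convention for your test titles or tags, and the project names are illustrative. The grep option itself is a standard per-project setting.

```typescript
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  projects: [
    {
      // Runs only tests whose title or tags match @smoke (assumed convention).
      name: 'smoke',
      grep: /@smoke/,
      use: { ...devices['Desktop Chrome'] },
    },
    {
      // Runs everything, including the smoke tests.
      name: 'chromium',
      use: { ...devices['Desktop Chrome'] },
    },
  ],
});
```

A pull request workflow would then run pnpm exec playwright test --project=smoke, while the main-branch workflow runs --project=chromium. Always pass --project with this layout, because running with no filter would execute the smoke tests twice.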
Conclusion and next steps
A well-tuned Playwright CI on GitHub Actions runs in single-digit minutes, caches everything cachable, shards smart, and keeps artifacts that matter. Once the pipeline is green, the next layer is reliability: read Playwright Retries Flaky Test Handling Guide for the patterns that keep CI from becoming the boy who cried wolf. Pair with the playwright-e2e skill so AI assistants generate tests that play well in shards and parallel runs from day one.