Playwright E2E Testing | Automation · Locators

April 7, 2026 · 22 min read · Updated April 7, 2026 · intermediate tutorial

Key takeaways

This guide translates the practical Playwright E2E Korean post—installation, locators, interactions, network interception, auth setup, CI—and adds internals: browser contexts, CDP, parallelism, trace artifacts, and production-adjacent patterns like synthetic checks against staging.

Introduction

I use Playwright, Microsoft’s cross-browser automation library, for end-to-end (E2E) tests. I run them to exercise the app the way a user would: drive the browser, submit forms, follow redirects, and check that critical journeys still work after refactors, dependency upgrades, and deploys. I wrote this for anyone who wants a practical, production-style setup—not just a single happy-path spec.

Real-world E2E scenarios I keep in mind:

Checkout and payments: a user signs in, adds items to a cart, applies a coupon, and reaches the confirmation page while I assert both UI text and a settled payment API call.
SaaS onboarding: invite flow, email verification (stubbed in lower environments), role assignment, and first-run empty states—often mixed with feature flags I verify in on/off states.
Internal admin tools: data grids, filters, bulk actions, and export. These UIs are slow and stateful; I need stable locators and often API seeding or route interception to avoid depending on live third parties.
Regulated or high-risk domains: healthcare and finance often require audit-friendly traces and reproducible test data. I lean on Playwright’s trace viewer, video, and HAR-style artifacts to see what the user (the test) actually did before a failure.
Preview deployments: every pull request gets a unique URL. I run the same suite in CI against that URL to catch environment-specific failures before merge.

For a broader feature tour of the ecosystem, I still send people to [Playwright Complete Guide](/en/blog/playwright-complete-guide/) when they want the full map.

```mermaid
flowchart LR
  subgraph dev["How I work"]
    A[Code change] --> B[Unit tests]
    B --> C[PR]
  end
  C --> D[E2E on preview URL]
  D -->|pass| E[Merge]
  D -->|fail| F[Trace + fix]
  F --> A
  E --> G[Deploy to staging]
  G --> H[Smoke / nightly]
```

Why Playwright?

Picking a browser automation tool is a long-term bet. I keep reaching for Playwright when I care about a few things at once, and I compare it in my head to Selenium and Cypress without pretending there is a single winner.

On browser coverage, Playwright gives me Chromium, Firefox, and WebKit (plus mobile emulation projects) in one runner. Cypress has been strongest in the Chrome family, with other engines depending on the version and architecture I run. Selenium still wins if I need “every WebDriver browser” in a huge enterprise grid, but I pay for that in operations.

Parallelism is where I feel Playwright's advantage most: workers, projects, and sharding are first-class. Cypress has improved here, but the mental model and defaults have historically been more opinionated. Selenium parallelizes, yet I am usually the one running and scaling a grid.

Auto-waiting is good in both Playwright and Cypress; the difference I notice is the command model and how failures surface. Selenium can be more manual unless I wrap it.

Multiple tabs and windows are straightforward in Playwright in a single test. Cypress used to be tighter here; I still verify the current release if that is a hard requirement. Selenium can do it, but the code gets long.

Network hooks matter to me. I use Playwright’s route, fulfill, continue, HAR, and WebSocket events constantly. Cypress’s cy.intercept is powerful but shapes tests differently. With Selenium I lean on BiDi, CDP, or proxies depending on the stack.

Debugging is personal: I live in Playwright’s trace viewer, UI mode, screenshots, and video on failure. Cypress’s time-travel DOM is a different flavor of “what happened,” and it can be great. Selenium’s story depends a lot on what I wrap it with.

Languages: I stay in TypeScript, but I like that Playwright also ships first-class support elsewhere. Cypress is JavaScript/TypeScript–centric for most teams. Selenium meets everyone where they are.

When I pick Playwright

I need one runner to cover multiple engines (especially WebKit for Safari-like behavior) without maintaining entirely separate harnesses.
I want speed at scale: parallel workers, shards in CI, and per-test browser contexts for isolation.
I care about debuggability after the fact: Playwright Traces bundle DOM snapshots, network, and console in one artifact.

When I slow down and reconsider

The org already has hundreds of thousands of lines in Selenium or Cypress, and migration cost outweighs benefits—I still sometimes add new specs in Playwright and let the suite grow that way.
I only ever target Chromium in CI and the team is deeply productive in Cypress; switching tools might not pay off on its own.

Installation and setup

I install Playwright a few different ways depending on greenfield vs brownfield and how the monorepo is laid out.

Method A: `npm` / `pnpm` / `yarn` in an existing Node project


```bash
npm init playwright@latest
# or: pnpm create playwright
# or: yarn create playwright
```

The initializer adds @playwright/test, example tests, a playwright.config.ts, and a GitHub Actions workflow if I request it. It also runs npx playwright install to download browser binaries (optionally with --with-deps on Linux CI).

Method B: Add to an existing repository manually


```bash
npm i -D @playwright/test
npx playwright install
```

On Linux agents I usually need system dependencies; I run npx playwright install-deps where appropriate (CI often uses the official Docker image or the --with-deps flag the docs recommend for the distribution I use).

Method C: `package.json` scripts


```json
{
  "scripts": {
    "test:e2e": "playwright test",
    "test:e2e:ui": "playwright test --ui",
    "test:e2e:debug": "PWDEBUG=1 playwright test",
    "test:e2e:headed": "playwright test --headed"
  }
}
```

On Windows, I use set PWDEBUG=1 or cross-env for environment variables in npm scripts if I need portability.

Minimal `playwright.config.ts`

The following is my balanced starter: baseURL avoids repeated full URLs, webServer boots my Vite/Next/Express app for local runs, and use.trace gives me on-first-retry traces in CI.


```typescript
// playwright.config.ts
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './e2e',
  fullyParallel: true,
  forbidOnly: !!process.env.CI,
  retries: process.env.CI ? 2 : 0,
  workers: process.env.CI ? 2 : undefined,
  reporter: [['html', { open: 'never' }], ['list']],

  use: {
    baseURL: 'http://127.0.0.1:3000',
    trace: 'on-first-retry',
    screenshot: 'only-on-failure',
    video: 'retain-on-failure',
  },

  webServer: {
    command: 'npm run start:e2e', // e.g. build + start, or Vite preview
    url: 'http://127.0.0.1:3000',
    reuseExistingServer: !process.env.CI,
    timeout: 120_000,
  },

  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
    { name: 'firefox', use: { ...devices['Desktop Firefox'] } },
    { name: 'webkit', use: { ...devices['Desktop Safari'] } },
  ],
});
```

reuseExistingServer is delightful locally: if I already have the dev server running, Playwright attaches instead of spawning a second process.

First tests (step by step)

1) Create a spec file

```typescript
// e2e/example.spec.ts
import { test, expect } from '@playwright/test';

test('home page has the expected title', async ({ page }) => {
  await page.goto('/');
  await expect(page).toHaveTitle(/My App/i);
});
```

page.goto respects baseURL from the config, so '/' becomes my app root.
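
To make that concrete, here is a small sketch of how a relative path combines with a base URL through the standard `URL` constructor, which is the merging rule the Playwright docs describe for `baseURL`. The `resolveAgainstBase` helper name is mine and exists only for illustration; the sub-path examples assume an app mounted under `/app`.

```typescript
// Hypothetical helper: mirrors how a relative path is merged with a base URL.
function resolveAgainstBase(path: string, baseURL: string): string {
  return new URL(path, baseURL).toString();
}

// A root-relative path lands on the app root.
console.log(resolveAgainstBase('/', 'http://127.0.0.1:3000'));
// → http://127.0.0.1:3000/

// With a base that has a sub-path, the trailing slash matters.
console.log(resolveAgainstBase('./login', 'http://127.0.0.1:3000/app/'));
// → http://127.0.0.1:3000/app/login
console.log(resolveAgainstBase('./login', 'http://127.0.0.1:3000/app'));
// → http://127.0.0.1:3000/login
```

The last two lines are the gotcha I watch for when an app is served under a sub-path: without the trailing slash on the base, the final path segment is replaced rather than extended.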

2) Use `test.step` for readable reports


```typescript
import { test, expect } from '@playwright/test';

test('sign-up happy path', async ({ page }) => {
  await test.step('Open landing', async () => {
    await page.goto('/');
    await page.getByRole('link', { name: 'Sign up' }).click();
  });

  await test.step('Fill form', async () => {
    await page.getByLabel('Email').fill('[email protected]');
    await page.getByLabel('Password').fill('Str0ng!pass');
    await page.getByRole('button', { name: 'Create account' }).click();
  });

  await test.step('Assert success', async () => {
    await expect(page.getByText('Check your email')).toBeVisible();
  });
});
```

3) Group related checks with `test.describe`


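```typescript
import { test } from '@playwright/test';

// Related checks share a describe block so the report groups them together.
test.describe('Account settings', () => {
  test('updates display name', async ({ page }) => {
    // ...
  });

  test('validates phone format', async ({ page }) => {
    // ...
  });
});
```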

My mental model: one primary assertion per test is ideal for failure diagnosis, but I am fine with a small cluster of related expectations when they describe a single user intent.

Locator strategies

I treat Playwright’s locator engine as a nudge toward resilient queries: I prefer role, label, and test ids over long CSS paths I copied from the inspector.

`getByRole` (preferred for interactive elements)


```typescript
await page.getByRole('button', { name: 'Submit' }).click();
await page.getByRole('textbox', { name: 'Email' }).fill('[email protected]');
await page.getByRole('combobox', { name: 'Country' }).selectOption('us');
```

Roles map to the accessibility tree; if my app has poor semantics, I feel the pain here—which is a useful signal to fix a11y.

`getByLabel` and `getByPlaceholder`


```typescript
await page.getByLabel('Password', { exact: true }).fill('secret');
await page.getByPlaceholder('Search…').fill('playwright');
```

`getByTestId` (when UI is unstable or repeated)

In the app, I set data-testid (or configure an alternative attribute) and assert against it.


```typescript
// playwright.config.ts (snippet)
// use: { testIdAttribute: 'data-qa' }

await page.getByTestId('checkout-summary-total').click();
```

getByTestId is excellent for leaf widgets; I avoid using it for everything, or I duplicate what roles already express.

CSS selectors and `locator`


```typescript
// Prefer a narrow section container + role
const dialog = page.locator('[data-testid="delete-dialog"]');
await dialog.getByRole('button', { name: 'Delete' }).click();
```

Chaining locators (page.getByRole('navigation').getByRole('link', { name: 'Docs' })) scopes queries and often beats one giant selector string.

Auto-waiting and assertions

How auto-waiting works

When I click, fill, or press, Playwright retries until the element is:

attached,
visible,
enabled (for actions that require it),
stable (not animating in a way that would miss the hit target),
receiving events (e.g. not covered by a modal in front).

This is not magic: if the element never becomes actionable, the action times out. I use tighter locators and deterministic app state instead of page.waitForTimeout(3000).
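
The checklist above can be pictured as a small retry loop. What follows is a toy model under my own assumptions, not Playwright's implementation: the `ElementState` fields, `firstFailingCheck`, and `waitUntilActionable` are names I made up, and a bounded attempt count stands in for a real timeout.

```typescript
// Toy model of the actionability checklist; field names are illustrative,
// not Playwright internals.
interface ElementState {
  attached: boolean;
  visible: boolean;
  enabled: boolean;
  stable: boolean;
  receivesEvents: boolean;
}

const CHECKS: Array<keyof ElementState> = [
  'attached', 'visible', 'enabled', 'stable', 'receivesEvents',
];

// Returns the first check that fails, or null when the element is actionable.
function firstFailingCheck(state: ElementState): keyof ElementState | null {
  for (const check of CHECKS) {
    if (!state[check]) return check;
  }
  return null;
}

// Probes the element up to maxAttempts times (a stand-in for "until timeout").
// Returns the 1-based attempt that succeeded, or throws naming the failed check.
function waitUntilActionable(probe: () => ElementState, maxAttempts: number): number {
  let lastFailure: keyof ElementState | null = null;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    lastFailure = firstFailingCheck(probe());
    if (lastFailure === null) return attempt;
  }
  throw new Error(`Timed out: element never passed the "${String(lastFailure)}" check`);
}

// Example: a button covered by a modal for the first two probes.
let probes = 0;
const coveredThenFree = (): ElementState => ({
  attached: true, visible: true, enabled: true, stable: true,
  receivesEvents: ++probes >= 3,
});
console.log(waitUntilActionable(coveredThenFree, 5)); // → 3
```

The useful part of the model is the error path: a timeout names the check that never passed, which is the same clue I look for in a real failure message before touching the locator.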

The `expect` web-first assertions

I lean on Playwright’s expect for locators because it is async-aware; it waits until the condition holds or a timeout is reached.


```typescript
import { test, expect } from '@playwright/test';

test('cart updates count', async ({ page }) => {
  await page.goto('/shop');
  await page.getByRole('button', { name: 'Add to cart' }).first().click();
  await expect(page.getByRole('status')).toHaveText('Cart (1)');
  await expect(page.getByTestId('cart-count')).toHaveText('1');
});
```

I use toBeVisible / toBeHidden / toBeEnabled for state transitions, and toHaveURL for navigation I expect the app to perform.

Tying assertions to network completion

```typescript
// Start waiting before the click so the response cannot be missed.
const responsePromise = page.waitForResponse((res) => {
  return res.url().includes('/api/cart') && res.request().method() === 'POST' && res.ok();
});

await page.getByRole('button', { name: 'Add to cart' }).click();
const res = await responsePromise;
const json = await res.json();
expect(json.itemCount).toBeGreaterThan(0);
```

I use this pattern to avoid races where the UI has not yet received data.
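
When the same kind of predicate shows up in several specs, I find it easier to build it once. This is a sketch with a hypothetical helper name (`successfulCall`); the two small interfaces only describe the methods the predicate needs (`url()`, `ok()`, `request().method()`), so it can be exercised without a browser.

```typescript
// Minimal structural types: just the methods the predicate relies on.
interface RequestLike {
  method(): string;
}
interface ResponseLike {
  url(): string;
  ok(): boolean;
  request(): RequestLike;
}

// Hypothetical helper: builds a predicate for a successful call to one endpoint.
function successfulCall(method: string, pathFragment: string) {
  const expectedMethod = method.toUpperCase();
  return (res: ResponseLike): boolean =>
    res.url().includes(pathFragment) &&
    res.request().method() === expectedMethod &&
    res.ok();
}

// Stand-in response object for a quick check outside the browser.
const fakeCartPost: ResponseLike = {
  url: () => 'http://127.0.0.1:3000/api/cart',
  ok: () => true,
  request: () => ({ method: () => 'POST' }),
};
console.log(successfulCall('post', '/api/cart')(fakeCartPost)); // → true
```

In a spec, the result would be passed straight to `page.waitForResponse(successfulCall('POST', '/api/cart'))`, still created before the click that triggers the request.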

API testing: mixing API calls with UI tests

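E2E does not mean “UI only” to me. My tests often need fast setup or server-truth validation. Playwright exposes a request fixture (an APIRequestContext) for this; it is isolated per test and respects config options such as baseURL and extraHTTPHeaders. When I need the API client to share cookies with the page itself, I use page.request, which is bound to the page's browser context. I use both for login shortcuts.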

`request` in isolation (pure API spec)


```typescript
import { test, expect } from '@playwright/test';

test('health endpoint', async ({ request, baseURL }) => {
  const res = await request.get(`${baseURL}/api/health`);
  expect(res.ok()).toBeTruthy();
  await expect(res).toBeOK();
});
```

Seeding, then opening the UI


```typescript
import { test, expect } from '@playwright/test';

test('shows invoice after API creates it', async ({ page, request, baseURL }) => {
  const create = await request.post(`${baseURL}/api/invoices`, {
    data: { customerId: 'cust_123', amount: 2500 },
  });
  expect(create.ok()).toBeTruthy();
  const { id } = await create.json();

  await page.goto(`/invoices/${id}`);
  await expect(page.getByRole('heading', { name: 'Invoice' })).toBeVisible();
  await expect(page.getByText('25.00')).toBeVisible();
});
```

If I cannot share cookies automatically between my API client and the page, I use explicit token injection in headers for API calls, and I set storage state in the browser as described in the next section.

Authentication patterns

Pattern 1: login once, reuse `storageState` (recommended in CI)

```typescript
// e2e/auth.setup.ts
import { test as setup, expect } from '@playwright/test';

const authFile = 'playwright/.auth/user.json';

setup('authenticate', async ({ page }) => {
  await page.goto('/login');
  await page.getByLabel('Email').fill(process.env.E2E_USER!);
  await page.getByLabel('Password').fill(process.env.E2E_PASS!);
  await page.getByRole('button', { name: 'Sign in' }).click();

  // Wait for the post-login redirect before saving cookies and storage.
  await expect(page).toHaveURL('/dashboard');
  await page.context().storageState({ path: authFile });
});
```


```typescript
// playwright.config.ts (snippet)
// projects: [
//   { name: 'setup', testMatch: /.*\.setup\.ts/ },
//   { name: 'chromium', use: { storageState: 'playwright/.auth/user.json' }, dependencies: ['setup'] },
// ]
```

Pattern 2: set cookies or localStorage manually (when I own the token)


```typescript
import { test } from '@playwright/test';

test('visit protected route with pre-set token', async ({ page, context }) => {
  await context.addCookies([
    { name: 'session', value: 'opaque-token-123', domain: '127.0.0.1', path: '/' },
  ]);

  await page.goto('/app');
  // or use page.addInitScript to seed localStorage before navigation
});
```

addInitScript runs before any page scripts; I use it for localStorage or feature flag overrides.

```typescript
await page.addInitScript((flags) => {
  window.localStorage.setItem('flags', JSON.stringify(flags));
}, { newBilling: true });
```

Okta, SAML, or social logins with third-party popups complicate storageState. In those cases, dedicated “auth helper” functions with retry and generous timeouts, plus a stable test user, are what I use to keep CI honest.

```mermaid
sequenceDiagram
  participant CI
  participant App
  participant IdP
  CI->>App: run setup project
  App->>IdP: login (once)
  IdP-->>App: session cookie
  App-->>CI: write storageState JSON
  CI->>App: run sharded tests with same session
```

Fixtures and page objects

Test-scoped custom fixtures


```typescript
// e2e/fixtures.ts
import { test as base } from '@playwright/test';
import { AccountPage } from './pages/AccountPage';

type Fixtures = { accountPage: AccountPage };

export const test = base.extend<Fixtures>({
  accountPage: async ({ page }, use) => {
    const account = new AccountPage(page);
    await use(account);
  },
});

export { expect } from '@playwright/test';
```


```typescript
// e2e/pages/AccountPage.ts
import { type Page, type Locator } from '@playwright/test';

export class AccountPage {
  readonly page: Page;
  readonly displayName: Locator;

  constructor(page: Page) {
    this.page = page;
    this.displayName = page.getByLabel('Display name');
  }

  async goto() {
    await this.page.goto('/account');
  }

  async save() {
    await this.page.getByRole('button', { name: 'Save changes' }).click();
  }
}
```

How I structure page code

Page objects are composable in my suites: I expose intent (save), not implementation (click #btn-12), unless that id is a stable data-testid.
I avoid giant base classes. I keep many small page fragments.
For cross-cutting concerns, I use fixtures (logging, per-test API client, default headers).

Parallel execution and sharding

Worker parallelism

fullyParallel: true runs independent tests in parallel within a file, subject to workers and machine CPU. I opt out for files that must run serially (shared DB counter, dangerous admin actions) using test.describe.configure({ mode: 'serial' }) sparingly.

Sharding in CI

I split a suite across machines to reduce wall-clock time:


```bash
npx playwright test --shard=1/4
npx playwright test --shard=2/4
npx playwright test --shard=3/4
npx playwright test --shard=4/4
```

Each job uploads its own blob-report (see Playwright merge-reports in docs) or HTML artifacts; how I merge depends on what leadership wants in one place.

What I have seen in practice: four shards on four runners can approach 4× throughput if tests are not bottlenecked on a single shared test database. For DB-heavy apps, I add isolation (schemas per worker, per-run UUID prefixes, or containers per shard).
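
Two small helpers illustrate the ideas above. Both are sketches with names I invented: `shardArgs` only generates the CLI arguments for a matrix, and `isolatedSchema` is one possible naming scheme for per-run, per-worker schemas. The `runId` value is an assumption (a CI run or PR identifier), and `workerIndex` would come from `testInfo.workerIndex` inside a test or fixture.

```typescript
// Builds the --shard arguments for a CI matrix of `total` jobs.
function shardArgs(total: number): string[] {
  if (!Number.isInteger(total) || total < 1) {
    throw new RangeError('total must be a positive integer');
  }
  return Array.from({ length: total }, (_, i) => `--shard=${i + 1}/${total}`);
}

// Hypothetical naming scheme for an isolated database schema per run and worker.
function isolatedSchema(base: string, runId: string, workerIndex: number): string {
  // Keep only characters that are safe in most schema names.
  const safeRun = runId.toLowerCase().replace(/[^a-z0-9]+/g, '_');
  return `${base}_${safeRun}_w${workerIndex}`;
}

console.log(shardArgs(4).join(' '));
// → --shard=1/4 --shard=2/4 --shard=3/4 --shard=4/4
console.log(isolatedSchema('e2e', 'PR-128', 2));
// → e2e_pr_128_w2
```

With a scheme like this, two shards seeding the same table at the same time never see each other's rows, which is what lets the extra runners translate into real throughput.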

Visual testing (screenshots)

toHaveScreenshot writes a baseline into my repo on the first run (that run fails because there is nothing to compare against yet); later runs compare new screenshots against the baseline, and I pass --update-snapshots when a visual change is intentional. I keep visual specs in a dedicated project or use expect.configure({ soft: true }) selectively, because visual tests are powerful but get noisy when fonts, OS, and GPU differ between my laptop and Linux CI.

```typescript
import { test, expect } from '@playwright/test';

test('dashboard layout', async ({ page }) => {
  await page.goto('/dashboard');
  await expect(page).toHaveScreenshot('dashboard.png', {
    maxDiffPixels: 20,
    animations: 'disabled',
  });
});
```

Stability tips

I disable animations where possible; I hide the caret in inputs if it flickers.
I set a fixed viewport and time zone: use: { viewport: { width: 1280, height: 720 }, timezoneId: 'UTC' }.
If Linux CI and macOS local baselines differ, I keep separate baselines with snapshotPathTemplate or I run visual checks only on a single OS project.

For failure triage, I always upload playwright-report, traces, and the expected vs actual images as CI artifacts.
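
The stability tips can live in config instead of in every spec. The fragment below is a sketch written as a plain object so it stands alone; in a real `playwright.config.ts` these keys would sit inside `defineConfig({ ... })`. The `__screenshots__` folder layout and the `en-US` locale are my assumptions, not requirements.

```typescript
// playwright.config.ts fragment as a plain object (assumed layout and values).
const visualSettings = {
  // One baseline folder per OS, so Linux CI and a macOS laptop never compare
  // against each other's images.
  snapshotPathTemplate: '{testDir}/__screenshots__/{platform}/{testFilePath}/{arg}{ext}',
  expect: {
    // Defaults applied to every toHaveScreenshot call.
    toHaveScreenshot: { maxDiffPixels: 20, animations: 'disabled' as const },
  },
  use: {
    // Pin everything that can shift layout or rendered dates.
    viewport: { width: 1280, height: 720 },
    timezoneId: 'UTC',
    locale: 'en-US',
  },
};

console.log(visualSettings.snapshotPathTemplate.includes('{platform}')); // → true
```

Putting `{platform}` in the template is what gives each OS its own baselines; dropping it would go back to a single shared set, which only works if visual checks run on one OS.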

Network interception (mocking APIs)

Stub JSON responses


```typescript
import { test, expect } from '@playwright/test';

test('empty state when API returns no items', async ({ page }) => {
  await page.route('**/api/items', (route) =>
    route.fulfill({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify({ items: [] }),
    }),
  );

  await page.goto('/items');
  await expect(page.getByText('No items yet')).toBeVisible();
});
```

Modify requests, pass through, or assert calls


```typescript
import { test, expect } from '@playwright/test';

test('sends x-request-id', async ({ page }) => {
  let sawHeader = false;

  await page.route('**/api/orders**', async (route) => {
    const headers = route.request().headers();
    sawHeader = typeof headers['x-request-id'] === 'string';
    // Pass the request through to the real backend unchanged.
    await route.continue();
  });

  await page.goto('/orders');
  await page.getByRole('button', { name: 'Refresh' }).click();

  // Poll rather than assert immediately: the request can fire after click() resolves.
  await expect.poll(() => sawHeader).toBe(true);
});
```

I centralize page.route in fixtures for repeated GraphQL operations so my scenarios stay readable and consistent across specs.
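
As a sketch of what that centralizing can look like, the helpers below pick a stub by GraphQL operation name. They are plain functions so the matching logic can be checked without a browser; `StubTable`, `operationNameOf`, and `stubFor` are names I made up, and the code assumes the app posts a JSON body with an `operationName` field.

```typescript
// Hypothetical stub table keyed by GraphQL operation name.
type StubTable = Record<string, unknown>;

// Extracts operationName from a JSON request body; returns undefined when the
// body is missing, is not JSON, or carries no string operation name.
function operationNameOf(postData: string | null): string | undefined {
  if (!postData) return undefined;
  try {
    const body: unknown = JSON.parse(postData);
    if (typeof body === 'object' && body !== null) {
      const name = (body as { operationName?: unknown }).operationName;
      return typeof name === 'string' ? name : undefined;
    }
  } catch {
    // Not JSON (for example a multipart upload): treat as unmatched.
  }
  return undefined;
}

// Returns the stubbed payload for a request, or undefined to let it pass through.
function stubFor(stubs: StubTable, postData: string | null): unknown {
  const name = operationNameOf(postData);
  if (name === undefined || !Object.prototype.hasOwnProperty.call(stubs, name)) {
    return undefined;
  }
  return { data: stubs[name] };
}

const exampleStubs: StubTable = { GetCart: { items: [] } };
const exampleBody = JSON.stringify({ operationName: 'GetCart', query: 'query GetCart { cart { items } }' });
console.log(JSON.stringify(stubFor(exampleStubs, exampleBody)));
// → {"data":{"items":[]}}
```

Inside a fixture, a `page.route('**/graphql', ...)` handler could call `stubFor(stubs, route.request().postData())`, then `route.fulfill({ json: stub })` when a stub is found and `route.continue()` otherwise. The `**/graphql` pattern is an assumption about where the endpoint lives.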

```mermaid
flowchart TB
  subgraph test["E2E test"]
    R[page.route fulfill / continue] -->|deterministic| UI[UI behavior]
  end
  subgraph app["App under test"]
    UI --> FE[Front-end]
    FE -->|API call| MSW[Mocked / stubbed HTTP]
  end
```

If my app already uses MSW in the browser, I pick one primary mocking layer in E2E: double mocks (MSW and Playwright route) are a frequent source of “works locally, fails in CI” confusion for me.

CI/CD: GitHub Actions (complete example)

Below is a single workflow I keep in .github/workflows/playwright.yml. It caches npm downloads through setup-node, installs browsers with their OS dependencies, runs four shards in parallel, and uploads the HTML report and test-results (including traces) from every shard, pass or fail.


```yaml
name: Playwright
on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

jobs:
  test:
    timeout-minutes: 30
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        shard: [1, 2, 3, 4]
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm

      - name: Install dependencies
        run: npm ci

      - name: Install Playwright browsers
        run: npx playwright install --with-deps

      - name: Build app (if needed for preview binary)
        run: npm run build

      - name: Run Playwright (shard ${{ matrix.shard }}/4)
        run: npx playwright test --shard=${{ matrix.shard }}/4
        env:
          CI: true
          E2E_USER: ${{ secrets.E2E_USER }}
          E2E_PASS: ${{ secrets.E2E_PASS }}

      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: playwright-report-shard-${{ matrix.shard }}
          path: |
            playwright-report/
            test-results/
          retention-days: 7
```

Merge sharded reports (optional) — Playwright 1.37+ supports blob reporters and a merge step; if I need a single HTML for management, I add the blob + npx playwright merge-reports flow from the current docs for my installed version.
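
For the merge flow, the config side is mostly a reporter switch. The sketch below isolates that decision in a function so it can be checked on its own; `pickReporters` and `ReporterEnv` are my names, and the env object is passed in rather than read from `process.env` to keep the example self-contained.

```typescript
type ReporterEnv = Record<string, string | undefined>;
type ReporterEntry = [string] | [string, Record<string, unknown>];

// Blob output on CI shards (to be merged later); HTML plus list locally.
function pickReporters(env: ReporterEnv): ReporterEntry[] {
  return env.CI
    ? [['blob']]
    : [['html', { open: 'never' }], ['list']];
}

console.log(JSON.stringify(pickReporters({ CI: 'true' })));
// → [["blob"]]
console.log(JSON.stringify(pickReporters({})));
// → [["html",{"open":"never"}],["list"]]
```

In the config this would read `reporter: pickReporters(process.env)`. Each shard then uploads its `blob-report` folder, and a final job downloads them into one directory and runs `npx playwright merge-reports --reporter html` against it; the directory name is whatever the download step used.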

I pair retries in config with trace: 'on-first-retry' so flaky failures still produce a trace without tracing every pass.

Debugging: UI Mode and the trace viewer

UI Mode


```bash
npx playwright test --ui
```

I lean on interactive test selection, time-travel debugging, watch mode, and live locator picking in the test runner UI when I want to move fast.

`PWDEBUG=1`


```bash
PWDEBUG=1 npx playwright test e2e/example.spec.ts
```

It opens a headed debug session with the inspector; I step through, edit locators, and read live logs.

Traces

I enable tracing for failures in CI, download trace.zip from the artifact, and open with:


```bash
npx playwright show-trace trace.zip
```

I see filmstrip, network, console, and source in one place—faster than screenshots alone when I am diagnosing race conditions and unhandled rejections in the app.

```mermaid
flowchart LR
  T[Failed test] --> A[test-results/]
  A --> B[trace.zip]
  B --> S[playwright show-trace]
  S --> C[Time-aligned DOM + network]
```

Debugging flaky tests that only failed in CI

I have lost whole afternoons to tests that were green on my machine and red in GitHub Actions—not every run, just often enough to erode trust. The worst case for me was a checkout spec that waited on a “success” banner while the cart API on Linux CI returned 200 a few hundred milliseconds later than on my M-series Mac. Locally the banner and the network line up; in CI, I sometimes clicked “next” before the client state caught up, and the assertion saw a toast from the previous step.

What saved me was not another waitForTimeout(500). I pulled the trace.zip, stepped the filmstrip, and watched the request waterfall: the symptom was order, not a missing element. I switched to waitForResponse on the cart POST, then asserted on the UI. I also pinned timezoneId and the viewport in config because another “CI-only” flake turned out to be layout from a different font subset on Ubuntu.

That pattern repeats: the failure looks like a locator problem until the trace shows timing, double mocks, or stale storage from a prior test. When I am debugging flaky tests that only fail in CI, the trace is where I start—and I am suspicious of anything I cannot reproduce with PWDEBUG=1 and a cold cache locally.

Test runner and browser architecture (internals)

Playwright Test (the @playwright/test runner) schedules tests across worker processes for isolation.
Each worker launches browser instances (Chromium, Firefox, WebKit) as needed.
A BrowserContext is a lightweight isolated session (cookies, storage, permissions) — ideal for parallel tests without cross-talk.
Pages live inside contexts; my test fixture receives a page tied to one context per test by default.

Chrome DevTools Protocol (CDP) and friends

For Chromium, Playwright commands flow through CDP (and similar protocols for other engines). That is why actions like tracing, network interception, and screenshots feel first-class: the runner controls the browser at the same layer as DevTools.

```mermaid
flowchart TB
  subgraph worker["Test worker process"]
    PWT["@playwright/test"]
  end
  PWT -->|protocol| BR[Browser]
  BR -->|BrowserContext| CTX1[Context: test A]
  BR -->|BrowserContext| CTX2[Context: test B]
  CTX1 --> PG1[Page]
  CTX2 --> PG2[Page]
```

This mental model also explains why I isolate with contexts: giving each test its own cookies and localStorage through a fresh context is much cheaper than launching a new browser process per test.

Coverage instrumentation (brief)

E2E tests generally do not drive line coverage the way Istanbul does for unit tests. If I need coverage of the frontend bundle in CI, I instrument the build and merge reports, or—more often—I treat E2E as journey coverage (critical user paths) separate from unit test metrics. Mixing the two in one gate can slow every PR; I keep Vitest/Jest as the home for line coverage, and E2E for the highest value user flows.

My testing rules

These are the rules I actually follow—not because a checklist says so, but because I have paid for the alternatives in CI minutes and missed dinners.

I reach for getByRole and labels first. They match what a user and assistive tech see, and they resist the brittle nth-child chains I used to copy out of DevTools. When I do use CSS, I scope with a container and a role, not one giant string.

I refuse waitForTimeout as a crutch. It hides the race, and it adds dead time to every run. I tie actions to network or to assertions that expect can retry.

I keep test data under my control: seed APIs, route stubs, or factories—not whatever production happens to return that day.

I log in once per worker where I can, with storageState, instead of running full OAuth in every file. I have watched too many suites spend a minute on login before they assert a label.

I keep one primary intent per spec so a failure names a user story. I do not pack ten expectations into the same test unless they are truly one beat.

I shard in CI so no single job owns a ninety-minute wall clock. I store traces on retry so intermittent failures still leave a trail I can hand to someone else.

I break long flows with test.step so HTML reports read like a story, and I tag (@smoke) so a push does not have to run every line on every change.
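
A tag is just text in the test title, so selecting by tag is a regular expression match. The fragment below is a sketch written as plain objects; in a real config the `smokeProject` entry would go in the `projects` array, and the project name and sample titles are invented for illustration.

```typescript
// playwright.config.ts fragment as a plain object (illustrative project name).
const smokeProject = { name: 'smoke', grep: /@smoke/ };

// Sample titles, including ones written inside test.step-heavy flows.
const sampleTitles = [
  'checkout completes with coupon @smoke',
  'admin exports a filtered grid',
  'login lands on dashboard @smoke',
];

// The same pattern decides which titles a smoke run would pick up.
const selected = sampleTitles.filter((title) => smokeProject.grep.test(title));
console.log(selected.length); // → 2
```

From the command line the equivalent filter is `npx playwright test --grep @smoke`, which is what I wire into push-triggered jobs while the full suite runs on a schedule.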

Issues I’ve debugged

“Element not found” is rarely Playwright’s fault in my runs. I have chased shadow DOM, portals that mount outside the subtree I was querying, and copy-pasted selectors that worked until the product team renamed a div. When I get stuck, I fire up npx playwright codegen for a prototype locator, or I use locator.filter to disambiguate two buttons with the same label.

Timeout on click often means the element is there but not receiving the click—an overlay, a cookie banner, or pointer-events: none on a full-screen div. I dismiss the layer or assert the top dialog is the one I think it is, then I retry the action. The trace’s screenshot strip makes this obvious once I look.

“Flaky in CI only” was my Linux fonts shifting flex layout by a few pixels, and once a time-zone difference that changed the “today” date in a date picker. I pin viewport, locale, and timezone when that matters, and I disable animations in screenshot specs.

“Works on my machine, red in CI” has been wrong baseURL, http vs https, or secrets that exist locally but not in the workflow. I log baseURL and the first few request URLs in a one-off debug job when I need proof.

2FA and real IdP flows I avoid in every test when I can. I use a test tenant, mock the broker, or capture storageState in a setup project so the main suite stays fast and deterministic.

Hangs after navigation showed up for me with a stubborn beforeunload handler and once with a redirect loop between /login and /app when a cookie was half-written. I used page.waitForURL with an explicit pattern and read the network panel in the trace until I saw the loop.

When a test is flaky, my first move is still the same: I open the trace, find a red request or console error, and I align the locator to what actually rendered in the snapshot—not what I remember the DOM should be.

How I compare Cypress, Selenium, and Playwright in one breath

I have seen Cypress win when the whole team already thinks in that runner, the suite is browser JavaScript-heavy, and a single-engine mental model matches the product. I have seen Selenium keep its seat when a company’s entire QA org and Grid are non-Node and migration would cost more than a new feature. I keep picking Playwright when I want WebKit in CI, network and trace features without bolting on extras, and parallel workers that do not need a story every time I add a file.

All three can succeed. In my experience the failure mode is under-investing in locator strategy and data isolation, not the logo on the box.

Real project structure (example)


```text
repo/
  e2e/
    fixtures.ts
    auth.setup.ts
    example.spec.ts
    account/
      profile.spec.ts
    pages/
      AccountPage.ts
      CheckoutPage.ts
    utils/
      api.ts
  playwright.config.ts
  package.json
  .github/
    workflows/
      playwright.yml
  playwright/.auth/      # .gitignore: generated session files
  test-results/          # .gitignore: CI output
  playwright-report/     # .gitignore: HTML report
```

What I do in the repo

I keep e2e/ for tests and page objects only, not app source, so bundlers and TypeScript paths stay clean.
I add playwright/.auth to .gitignore when I store ephemeral login state. I have seen teams check in read-only smoke sessions for a fake test tenant—I treat that as a policy call, not a default.
utils/api.ts is where I keep small wrappers around request to create users, clear carts, and reset flags—short and boring on purpose.

Production-adjacent testing patterns (guardrails)

Preview deployments: I run the same suite against every PR’s URL when I can.
Staging smoke: after deploy, I run a shorter subset; I tag tests @smoke and filter.
Synthetic monitoring: I schedule Playwright (or lighter HTTP checks) against canary endpoints. I do not hit production write paths without safeguards, rate limits, and explicit off-hours windows.
Feature flags: I assert both on/off in separate projects or tagged tests when flags gate revenue-critical flows.

E2E against true production is rare for write flows in my work. I prefer read-only smoke, and I separate RUM + SLOs for real user health.
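
To point one suite at a preview, staging, or local target, I like a single switch. This is a minimal sketch under my own assumptions: `E2E_BASE_URL` is a hypothetical variable name a CI job would set, `resolveTarget` is an invented helper, and the env object is passed in so the function can be checked without touching `process.env`.

```typescript
type EnvVars = Record<string, string | undefined>;

interface Target {
  baseURL: string;
  startLocalServer: boolean;
}

// Uses the remote URL when E2E_BASE_URL (hypothetical name) is set;
// otherwise falls back to the local dev server.
function resolveTarget(env: EnvVars, localURL = 'http://127.0.0.1:3000'): Target {
  const remote = env.E2E_BASE_URL?.trim();
  if (!remote) {
    return { baseURL: localURL, startLocalServer: true };
  }
  // new URL throws on a malformed value, so a bad CI variable fails fast.
  const url = new URL(remote);
  return { baseURL: url.href, startLocalServer: false };
}

console.log(resolveTarget({}).baseURL);
// → http://127.0.0.1:3000
console.log(resolveTarget({ E2E_BASE_URL: 'https://preview.example.com' }).baseURL);
// → https://preview.example.com/
```

In the config, `use.baseURL` would take `baseURL`, and the `webServer` block would be set only when `startLocalServer` is true, so a run against a preview or staging URL never boots a local app by accident.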

[Cypress E2E Testing | Selectors· cy.intercept](/en/blog/cypress-e2e-testing-guide/)
[Jest Complete Guide | JavaScript Testing· Mocking](/en/blog/jest-complete-guide/)
[Vitest Complete Guide | Unit Testing· Mocking](/en/blog/vitest-complete-guide/)
[Vitest Browser Mode](/en/blog/vitest-browser-mode-testing-guide/): in-browser unit/component tests vs full E2E

FAQ (from this guide’s front matter)

Playwright vs Cypress? — I lean on Playwright when I need multi-browser runs, strong parallelism, and trace debugging with a uniform test model. I still consider Cypress when the team’s DX and single-browser model are already a better fit. I decide based on skill, what is already in the repo, and whether I need WebKit in CI.

Why are my tests flaky? — I drop fixed sleeps; I use locators with auto-wait, I tie assertions to waitForResponse, and I stabilize data with route interception or seeded backends. I reuse storageState for auth instead of logging in for every file.

Do I run Playwright against production? — Usually I aim at staging, preview deploys, or canary URLs — not user traffic. For production health, I lean on RUM and synthetic checks that are explicitly read-only and safe; I do not treat Playwright in production as a load testing tool.
