
How to Run Playwright Tests in Parallel on AWS Spot Instances

TraceLoom Team

Running 1,000 Playwright tests sequentially is slow. Running them in parallel on managed CI platforms is expensive. Running them in parallel on AWS Spot instances inside your own account is fast, cheap, and keeps your test data inside your VPC.

This post covers how to do it — the architecture, the sharding strategy, interruption handling, and what it costs in practice.

Why Spot instances for Playwright?

Playwright tests are embarrassingly parallel. Each test is isolated, stateless, and can run on any machine. This makes them a near-perfect workload for EC2 Spot instances:

  • Short-lived — most test runs complete in 2–15 minutes, well under interruption risk windows
  • Restartable — if a Spot instance is reclaimed, individual tests can be retried on another worker
  • Compute-bound — browsers are CPU and memory intensive; Spot gives you large instances at 60–90% discount

For a typical run of 847 tests across 50 workers, the AWS Spot cost comes to roughly $0.23. The same run costs $8–15 in GitHub Actions runner minutes, and $40–80 in BrowserStack parallel sessions.

The architecture

The core pattern is straightforward:

CI trigger → SQS message → Lambda → EC2 Spot fleet (50 workers)
                                         |
                                    Shard test list
                                         |
                               Run Playwright on each shard
                                         |
                                  Upload trace.zip to S3

Each EC2 worker:

  1. Pulls its shard assignment from the orchestrator
  2. Runs playwright test on that subset with trace: 'on'
  3. Uploads trace.zip to your S3 bucket
  4. Reports status back
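Steps 1 and 2 can be sketched as a small worker entry point. A minimal sketch, assuming the orchestrator passes the assignment through `SHARD` and `TOTAL_SHARDS` environment variables (those names are illustrative, not a fixed interface):

```typescript
import { execSync } from 'node:child_process';

// Build the Playwright command for a worker's shard (1-based index).
function shardCommand(shard: number, total: number): string {
  if (shard < 1 || shard > total) {
    throw new Error(`shard ${shard} out of range 1..${total}`);
  }
  return `npx playwright test --shard=${shard}/${total} --trace on`;
}

// Worker entry point: read the assignment from the environment and run it.
// Traces land under test-results/ for the upload step.
function runShard(): void {
  const cmd = shardCommand(
    Number(process.env.SHARD),
    Number(process.env.TOTAL_SHARDS),
  );
  execSync(cmd, { stdio: 'inherit' });
}
```

In practice you would call `runShard()` from the instance bootstrap script after `npm ci` and browser installation.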

Instance sizing

For Playwright, c6i.xlarge (4 vCPU, 8 GB RAM) handles ~20 tests per worker comfortably. For large suites with heavy DOM operations, c6i.2xlarge (8 vCPU, 16 GB) lets you push to 40–50 tests per worker.

Spot pricing as of early 2026:

  • c6i.xlarge: ~$0.04–0.06/hour
  • c6i.2xlarge: ~$0.08–0.12/hour

A 50-worker fleet of c6i.xlarge running for 5 minutes costs approximately $0.17.
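The arithmetic behind that number is just workers × hours × hourly price, which is worth wiring into your orchestrator so cost per run is visible. A sketch, assuming a flat Spot price:

```typescript
// Rough fleet cost for one run: workers × (minutes / 60) × hourly Spot price.
function fleetCost(workers: number, minutes: number, pricePerHour: number): number {
  return workers * (minutes / 60) * pricePerHour;
}

// 50 × c6i.xlarge for 5 minutes at $0.04/hr ≈ $0.17
const perRun = fleetCost(50, 5, 0.04);
```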

Sharding strategy

Playwright ships a native --shard flag:

playwright test --shard=1/50
playwright test --shard=2/50
# ...
playwright test --shard=50/50

This distributes tests evenly by file. For most suites it works well. For suites with highly variable test duration (some tests take 2 seconds, others take 45 seconds), you’ll want duration-aware sharding — tracking previous run durations and bin-packing shards to balance total elapsed time per worker.

In TraceLoom, shard assignments are computed before the fleet launches. The orchestrator Lambda reads the test list, checks DynamoDB for historical durations, and assigns shards to minimize the longest worker runtime (the critical path).
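Duration-aware assignment doesn't need anything exotic: a greedy longest-processing-time (LPT) pass gets close to balanced shards. A sketch, assuming you already have per-test durations from a previous run:

```typescript
type TestEntry = { file: string; durationMs: number };

// Greedy LPT bin-packing: place the longest tests first, each onto the
// currently lightest shard, to minimize the slowest worker (the critical path).
function assignShards(tests: TestEntry[], shardCount: number): TestEntry[][] {
  const shards: TestEntry[][] = Array.from({ length: shardCount }, () => []);
  const load: number[] = new Array(shardCount).fill(0);
  const sorted = [...tests].sort((a, b) => b.durationMs - a.durationMs);
  for (const t of sorted) {
    const i = load.indexOf(Math.min(...load)); // lightest shard so far
    shards[i].push(t);
    load[i] += t.durationMs;
  }
  return shards;
}
```

Plain `--shard=k/n` splits by file count; this splits by predicted time, which is what keeps a few 45-second tests from stranding one worker long after the rest have finished.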

Handling Spot interruptions

Spot instances can be reclaimed with a 2-minute warning. For Playwright, the mitigation is simple:

  1. Heartbeat to SQS — each worker sends a heartbeat every 30 seconds. If the orchestrator misses 3 heartbeats, it marks that shard as failed.
  2. Requeue on interruption — the 2-minute warning triggers a graceful shutdown script that pushes the incomplete shard back to SQS for another worker to pick up.
  3. Partial results — any traces already uploaded to S3 are preserved. Only the remaining tests in that shard need to be rerun.

In practice, Spot interruption rates for c6i instances in us-east-1 run below 5% across most availability zones. For 5-minute runs, the probability of interruption is under 1%.
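The 2-minute warning surfaces on the instance metadata service: `/latest/meta-data/spot/instance-action` returns 404 until a reclaim is scheduled, then a JSON body with the action and time. A polling sketch (IMDSv1 for brevity; with IMDSv2 you would first fetch a session token, and `requeueShard` is an assumed helper that pushes the unfinished shard back to SQS):

```typescript
const IMDS_URL =
  'http://169.254.169.254/latest/meta-data/spot/instance-action';

type Notice = { action: string; time: string };

// Pure parser so the poll loop is easy to test off-instance:
// 404 means no interruption is scheduled.
function parseNotice(status: number, body: string): Notice | null {
  return status === 404 ? null : (JSON.parse(body) as Notice);
}

// Poll every 5 seconds; on a notice, requeue the unfinished shard and exit.
async function watchForInterruption(
  requeueShard: () => Promise<void>,
): Promise<void> {
  for (;;) {
    const res = await fetch(IMDS_URL).catch(() => null);
    const notice = res ? parseNotice(res.status, await res.text()) : null;
    if (notice) {
      await requeueShard();
      process.exit(0);
    }
    await new Promise((r) => setTimeout(r, 5000));
  }
}
```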

Trace storage in S3

Every Playwright test should run with trace: 'on'. The resulting trace.zip contains:

  • DOM snapshots at each action step
  • Network requests and responses
  • Console logs
  • Screenshots on failure
  • A full action timeline

These files are typically 0.5–5 MB per test. For 1,000 tests per day, that’s 500 MB–5 GB of new trace data daily. At S3 Standard pricing (~$0.023/GB-month), each day’s traces cost under $0.12 per month to store — it’s the retained total that adds up, as the cost table below shows. Moving traces to S3 Intelligent-Tiering after 7 days drops costs further.
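That 7-day transition can be automated with a bucket lifecycle rule. A sketch of the rule, assuming traces live under the `runs/` prefix (apply it with `aws s3api put-bucket-lifecycle-configuration`):

```json
{
  "Rules": [
    {
      "ID": "traces-to-intelligent-tiering",
      "Status": "Enabled",
      "Filter": { "Prefix": "runs/" },
      "Transitions": [
        { "Days": 7, "StorageClass": "INTELLIGENT_TIERING" }
      ]
    }
  ]
}
```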

The upload script is minimal:

import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';
import { createReadStream } from 'fs';

const s3 = new S3Client({ region: process.env.AWS_REGION });

// runId, testName, and tracePath come from the worker's shard context.
await s3.send(new PutObjectCommand({
  Bucket: process.env.S3_BUCKET,
  Key: `runs/${runId}/${testName}/trace.zip`,
  Body: createReadStream(tracePath),
  ContentType: 'application/zip',
}));

Store the S3 key alongside your test result in DynamoDB so you can link directly to the trace from your test dashboard.
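A sketch of that record as a DynamoDB item — the attribute names here are illustrative, not a prescribed schema:

```typescript
// Build the DynamoDB item linking a test result to its trace in S3.
// Attribute names (pk/sk/status/traceKey) are an example layout.
function traceItem(runId: string, testName: string, status: string) {
  return {
    pk: { S: `RUN#${runId}` },
    sk: { S: `TEST#${testName}` },
    status: { S: status },
    traceKey: { S: `runs/${runId}/${testName}/trace.zip` },
  };
}

// Written with PutItemCommand from @aws-sdk/client-dynamodb, e.g.:
// await ddb.send(new PutItemCommand({ TableName: 'test-results',
//   Item: traceItem(runId, testName, 'failed') }));
```

Keying by run then test keeps one run's results in a single partition, so the dashboard can fetch everything for a run with one Query.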

Cost at scale

Here’s the real math for a team running 1,000 tests/day in three CI runs:

  • EC2 Spot compute (50 workers × 5 min × 3 runs × 30 days): ~$15
  • S3 storage (1.5 TB trace data, Standard): ~$34
  • DynamoDB (on-demand, ~10M read/write units): ~$6
  • SQS + Lambda (negligible at this scale): <$1
  • Total AWS costs: ~$56/month

Compare to GitHub Actions at 50 parallel jobs (if you could get that many): approximately $450/month in runner minutes. BrowserStack Automate at equivalent concurrency: $1,000+/month.

The savings fund the platform fee several times over.

Viewing traces after a run

The Playwright trace viewer can open any trace.zip directly:

# Download a specific test trace from S3
aws s3 cp s3://your-bucket/runs/r_8xq/checkout-flow/trace.zip .
npx playwright show-trace trace.zip

Or, if you’re using TraceLoom, every trace is linked directly from the run dashboard — click a failed test and the trace viewer opens in-browser.

The alternative: TraceLoom

Building this infrastructure yourself takes 2–4 weeks for a solid implementation. You need to handle:

  • VPC and subnet configuration for the Spot fleet
  • CDK or Terraform for the SQS/Lambda/EC2 stack
  • Bootstrap scripts for worker startup and Playwright installation
  • Shard assignment logic and duration-aware bin-packing
  • Heartbeat and interruption handling
  • DynamoDB schema for run/test/shard records
  • A dashboard to view results and open traces

TraceLoom ships all of this as a CDK stack that deploys into your AWS account. Your tests run in your VPC, traces go to your S3 bucket, and you get a dashboard with trace viewer integration. The platform fee is $79/month. The first 500 runs/month are free.

If you’d rather run the infrastructure yourself, everything described in this post can be built from scratch — the architecture is solid. If you’d rather skip the build and start running tests today, TraceLoom is the fastest path.


Have questions about sharding strategy or Spot configuration for your test suite? Email us at hello@traceloom.io.