Selenium Grid 4 Docker and Kubernetes Complete Guide 2026
Master Selenium Grid 4 with Docker Compose and Kubernetes Helm charts. Cover hubs, nodes, dynamic sessions, video recording, KEDA autoscaling, and CI patterns.
Selenium Grid 4 Docker and Kubernetes Complete Guide 2026
Selenium Grid 4 is the distributed Selenium runtime: a coordinator (formerly called the hub) plus a set of nodes that host browsers. Tests connect to the coordinator endpoint, the coordinator selects an available node with the requested browser, and the test runs there. This pattern lets you run hundreds of parallel browser tests against a fixed pool of node containers, scale that pool horizontally on Docker hosts or Kubernetes, and isolate every test in its own browser process.
This guide covers Selenium Grid 4 end-to-end in 2026. We walk through the new modular architecture (Standalone, Hub-Node, Distributed, Fully Distributed), Docker Compose setups for local development, Kubernetes deployment with the Selenium Helm chart, browser node configuration, dynamic grids that spawn nodes per session, video recording, KEDA-based autoscaling, BiDi over WebSocket, and CI integration patterns. We compare to alternatives like Selenoid, Moon, and BrowserStack Automate. For more on protocols see Selenium BiDi protocol guide and Selenium Manager, and browse the skills directory.
Why Grid 4
Three reasons. First, parallelism. Modern test suites have hundreds or thousands of cases. Running them sequentially takes hours; running them in parallel against Grid takes minutes. Second, browser version coverage. Grid lets you maintain a pool with Chrome 122, 123, 124, Firefox latest, Safari, and Edge simultaneously. Each test requests the browser-version combo it needs. Third, isolation. Each test gets its own clean browser process. State leaks between tests disappear because the browser is killed after each session.
Selenium Grid 4 made significant improvements over Grid 3: a redesigned coordinator architecture, BiDi over WebSocket support, Observability via OpenTelemetry, and Helm-friendly deployment. Migration from Grid 3 is mostly mechanical but worth doing now if you have not.
| Component | Role |
|---|---|
| Coordinator (Hub) | Routes new sessions to nodes |
| Distributor | Selects which node gets a new session |
| Router | Forwards commands to the right session |
| Session Queue | Holds requests when all nodes are busy |
| Event Bus | Internal messaging between components |
| Node | Hosts browsers, registers with coordinator |
| Session Map | Tracks active sessions |
Deployment Modes
Grid 4 supports four deployment modes:
- Standalone. All components in one process. Good for local dev or tiny teams.
- Hub and Node. Classic Grid 3 layout. Coordinator runs everything except browsers.
- Distributed. Each component runs as a separate process. For larger scale.
- Fully Distributed with Kubernetes. Each component as a Pod. Production scale.
Most teams start with Standalone, graduate to Hub-Node as test counts grow, then move to Kubernetes when they need autoscaling.
Docker Compose: Hub and Nodes
The simplest production-ready setup is Hub and Node via Docker Compose. The Selenium project publishes official images that handle most of the complexity.
# docker-compose.yml
version: '3.8'
services:
selenium-hub:
image: selenium/hub:4.27.0
ports:
- "4442:4442"
- "4443:4443"
- "4444:4444"
environment:
- GRID_MAX_SESSION=50
- GRID_BROWSER_TIMEOUT=300
chrome:
image: selenium/node-chrome:4.27.0
shm_size: 2gb
deploy:
replicas: 5
depends_on:
- selenium-hub
environment:
- SE_EVENT_BUS_HOST=selenium-hub
- SE_EVENT_BUS_PUBLISH_PORT=4442
- SE_EVENT_BUS_SUBSCRIBE_PORT=4443
- SE_NODE_MAX_SESSIONS=5
- SE_NODE_SESSION_TIMEOUT=300
firefox:
image: selenium/node-firefox:4.27.0
shm_size: 2gb
deploy:
replicas: 3
depends_on:
- selenium-hub
environment:
- SE_EVENT_BUS_HOST=selenium-hub
- SE_EVENT_BUS_PUBLISH_PORT=4442
- SE_EVENT_BUS_SUBSCRIBE_PORT=4443
edge:
image: selenium/node-edge:4.27.0
shm_size: 2gb
deploy:
replicas: 2
depends_on:
- selenium-hub
environment:
- SE_EVENT_BUS_HOST=selenium-hub
- SE_EVENT_BUS_PUBLISH_PORT=4442
- SE_EVENT_BUS_SUBSCRIBE_PORT=4443
Spin up:
docker compose up --scale chrome=5 --scale firefox=3
# Then open http://localhost:4444 to see the Grid UI
This gives you 5 Chrome nodes, 3 Firefox nodes, 2 Edge nodes, each with 5 sessions. Total parallel browser sessions: 50 (5 nodes * 5 sessions Chrome) + 15 + 10 = 75.
shm_size: 2gb matters. Without it Chrome crashes from /dev/shm exhaustion in headless mode.
Connecting Tests
Tests connect to http://localhost:4444 (or your remote Grid coordinator URL) using RemoteWebDriver. Capabilities specify which browser and platform you want.
// Java with TestNG
ChromeOptions options = new ChromeOptions();
options.setCapability("browserVersion", "122");
options.setCapability("platformName", "linux");
options.setCapability("se:name", "Login test " + testName);
options.setCapability("se:recordVideo", true);
options.setCapability("se:screenResolution", "1920x1080");
WebDriver driver = new RemoteWebDriver(
new URL("http://grid.example.com:4444/wd/hub"),
options
);
driver.get("https://staging.example.com");
# Python with pytest
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.set_capability("browserVersion", "122")
options.set_capability("se:name", "Checkout test")
options.set_capability("se:recordVideo", True)
driver = webdriver.Remote(
command_executor="http://grid.example.com:4444/wd/hub",
options=options,
)
driver.get("https://staging.example.com")
se:name shows up in the Grid UI for live debugging. se:recordVideo enables video capture if the video node is running.
Kubernetes Deployment
For production scale, deploy Grid to Kubernetes using the official Helm chart.
helm repo add docker-selenium https://www.selenium.dev/docker-selenium
helm repo update
# Install with defaults
helm install grid docker-selenium/selenium-grid \
--namespace selenium-grid --create-namespace
The chart deploys all Grid components as separate Deployments. Default resource limits are modest. For real workloads override via values.yaml:
# values.yaml
isolateComponents: true
ingress:
enabled: true
hostname: grid.example.com
basicAuth:
enabled: true
username: ${{ secrets.GRID_USER }}
password: ${{ secrets.GRID_PASS }}
chromeNode:
replicas: 10
hpa:
enabled: true
minReplicas: 2
maxReplicas: 50
targetCPU: 70
resources:
requests:
cpu: 1
memory: 2Gi
limits:
cpu: 2
memory: 4Gi
firefoxNode:
replicas: 5
resources:
requests:
cpu: 1
memory: 2Gi
edgeNode:
replicas: 3
videoRecorder:
enabled: true
global:
seleniumGrid:
nodeMaxSessions: 4
sessionRequestTimeout: 300
Install with custom values:
helm install grid docker-selenium/selenium-grid \
--namespace selenium-grid --create-namespace \
-f values.yaml
This gives you 10 Chrome nodes with HPA scaling to 50, 5 Firefox, 3 Edge, with video recording enabled. The Ingress exposes the Grid endpoint with basic auth.
KEDA Autoscaling
HPA scales on CPU, which doesn't always reflect actual demand for Grid sessions. KEDA (Kubernetes Event-Driven Autoscaling) can scale Grid nodes based on queue length: if there are pending session requests, spin up more nodes.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: selenium-chrome-scaler
namespace: selenium-grid
spec:
scaleTargetRef:
name: selenium-chrome-node
minReplicaCount: 2
maxReplicaCount: 100
pollingInterval: 10
cooldownPeriod: 300
triggers:
- type: selenium-grid
metadata:
url: http://selenium-hub.selenium-grid.svc:4444/graphql
browserName: chrome
sessionsPerNode: 4
activationThreshold: 1
KEDA queries the Grid GraphQL endpoint, gets the count of pending sessions, and adjusts replicas accordingly. This scales much more cleanly than CPU-based HPA for browser workloads.
Dynamic Grid
Selenium Grid 4 supports a "dynamic" mode where the coordinator pulls a Docker image and starts a fresh container per session. The container is destroyed after the session ends. This gives perfect isolation at the cost of slower per-session startup.
# config.toml on the node
[node]
detect-drivers = false
[docker]
configs = [
"selenium/standalone-chrome:122.0", "{\"browserName\": \"chrome\", \"browserVersion\": \"122\"}",
"selenium/standalone-firefox:latest", "{\"browserName\": \"firefox\"}",
"selenium/standalone-edge:latest", "{\"browserName\": \"MicrosoftEdge\"}"
]
host-config-keys = ["Dns", "DnsOptions", "DnsSearch", "ExtraHosts"]
video-image = "selenium/video:latest"
assets-path = "/opt/selenium/assets"
When the coordinator receives a session request for Chrome 122, it pulls the selenium/standalone-chrome:122.0 image and runs it. After the session ends, the container exits. This is the model used by Selenoid and Moon as well.
Video Recording
Grid 4 ships with a video recording sidecar. When you request se:recordVideo in capabilities, the node spins up a recording container that captures the browser's video output. The MP4 is written to a shared volume.
videoRecorder:
enabled: true
uploader:
enabled: true
name: aws-s3
s3:
bucket: my-selenium-videos
region: us-east-1
key: ${{ secrets.AWS_KEY }}
secret: ${{ secrets.AWS_SECRET }}
Configure the uploader to push completed videos to S3 (or Azure Blob, GCS). Videos are typically 1-10 MB per test and named by session ID for easy lookup when a test fails.
BiDi Over WebSocket
Selenium Grid 4 supports BiDi (Bidirectional protocol) for WebDriver. This lets your test subscribe to browser events (console logs, network requests, JS errors) in real time.
// JavaScript with selenium-webdriver
const { Builder, By } = require('selenium-webdriver');
(async () => {
const driver = await new Builder()
.usingServer('http://grid.example.com:4444/wd/hub')
.forBrowser('chrome')
.build();
const session = await driver.session();
const bidiSocket = await driver.getBidi();
// Subscribe to console logs
await bidiSocket.subscribe('log.entryAdded');
bidiSocket.on('log.entryAdded', (entry) => {
console.log('Browser console:', entry.text);
});
await driver.get('https://example.com');
})();
See Selenium BiDi protocol guide for full coverage.
Observability
Grid 4 emits OpenTelemetry spans and metrics. Pipe them into your existing observability stack:
# values.yaml
tracing:
enabled: true
exporter:
type: otlp
endpoint: http://otel-collector:4317
metrics:
enabled: true
exporter:
prometheus:
enabled: true
port: 9090
This gives you per-session traces, queue length metrics, and node CPU/memory metrics in your standard Grafana dashboards.
CI Integration
Standard pattern with GitHub Actions:
name: Selenium Tests
on:
pull_request:
branches: [main]
jobs:
test:
runs-on: ubuntu-latest
services:
selenium-hub:
image: selenium/hub:4.27.0
ports: ['4444:4444', '4442:4442', '4443:4443']
chrome-node:
image: selenium/node-chrome:4.27.0
env:
SE_EVENT_BUS_HOST: selenium-hub
SE_EVENT_BUS_PUBLISH_PORT: 4442
SE_EVENT_BUS_SUBSCRIBE_PORT: 4443
SE_NODE_MAX_SESSIONS: 4
options: --shm-size=2gb
steps:
- uses: actions/checkout@v4
- name: Wait for Grid
run: |
until curl -sSL "http://localhost:4444/wd/hub/status" | jq -r '.value.ready' | grep "true"; do
sleep 1
done
- uses: actions/setup-java@v4
with:
distribution: 'temurin'
java-version: '17'
- name: Run tests
run: |
mvn test -Dgrid.url=http://localhost:4444 -DforkCount=4
For larger CI fleets, run a persistent Grid in your test cluster and have CI pipelines connect to it. This avoids the per-job Grid spin-up cost.
Alternatives: Selenoid and Moon
For teams that find Grid 4 heavyweight, the Selenoid (open source) and Moon (commercial, Kubernetes-native) alternatives offer similar capabilities with a different architecture. They use per-session containers like Dynamic Grid by default.
| Tool | License | Strength |
|---|---|---|
| Selenium Grid 4 | Apache 2.0 | Official, BiDi support, Helm chart |
| Selenoid | Apache 2.0 | Faster startup, lightweight |
| Moon | Commercial | Kubernetes-native, scaling features |
| BrowserStack Automate | Commercial SaaS | No infrastructure |
| Sauce Labs | Commercial SaaS | No infrastructure |
Common Issues
Five gotchas teams hit:
- shm_size too small. Chrome crashes in headless. Set to 2gb minimum.
- Hostname resolution. Nodes need to reach the coordinator hostname. In Kubernetes use Service names; in Docker Compose use container names.
- Browser version mismatch. Tests requesting an unavailable browserVersion hang in the queue. Use platform-agnostic capabilities or pin to known-good versions.
- Session timeout. Default 300s might be too short for long tests. Tune
SE_NODE_SESSION_TIMEOUT. - Network policies block GraphQL. KEDA needs to query the Grid GraphQL endpoint. Ensure your NetworkPolicy allows it.
Conclusion
Selenium Grid 4 is the right distributed Selenium runtime for teams in 2026. Helm-friendly deployment, BiDi over WebSocket, dynamic per-session containers, and KEDA autoscaling make it production-grade out of the box. Compared to commercial SaaS alternatives like BrowserStack and Sauce Labs, Grid gives you full control at a fraction of the cost when you have the operational capacity.
If you are starting from scratch, run the Docker Compose example above to learn the basics, then move to Kubernetes via Helm chart for production. Browse the skills directory for Selenium AI agent skills and read Selenium Manager for browser driver management.