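11 - Docker and Cluster Deployment
Run headless browsers in Docker containers, build a distributed test cluster with Selenium Grid, and set up a highly available browser automation infrastructure.
11.1 Why Run Browsers in Docker
| Benefit | Description |
|---|---|
| Environment consistency | Development, test, and production all run the same image |
| Isolation | Each browser instance runs independently and does not affect the others |
| Scalability | Instances can be created and destroyed quickly, which allows horizontal scaling |
| CI/CD integration | Containers fit naturally into CI/CD pipelines |
| Resource control | CPU, memory, and network usage can be limited precisely |
| Easy maintenance | Images are versioned, so rollbacks are straightforward |
11.2 Chrome Basics in Docker
Dockerfile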
FROM ubuntu:22.04
# Install system dependencies (unzip is needed below to unpack ChromeDriver)
RUN apt-get update && apt-get install -y \
    wget \
    unzip \
    gnupg \
    ca-certificates \
    fonts-noto-cjk \
    fonts-liberation \
    libnss3 \
    libnspr4 \
    libatk1.0-0 \
    libatk-bridge2.0-0 \
    libcups2 \
    libdrm2 \
    libxkbcommon0 \
    libxcomposite1 \
    libxdamage1 \
    libxfixes3 \
    libxrandr2 \
    libgbm1 \
    libpango-1.0-0 \
    libcairo2 \
    libasound2 \
    libatspi2.0-0 \
    xdg-utils \
    && rm -rf /var/lib/apt/lists/*
# Install Google Chrome from Google's apt repository
RUN wget -q -O - https://dl.google.com/linux/linux_signing_key.pub | gpg --dearmor -o /usr/share/keyrings/google-chrome.gpg \
    && echo "deb [arch=amd64 signed-by=/usr/share/keyrings/google-chrome.gpg] https://dl.google.com/linux/chrome/deb/ stable main" \
       > /etc/apt/sources.list.d/google-chrome.list \
    && apt-get update \
    && apt-get install -y google-chrome-stable \
    && rm -rf /var/lib/apt/lists/*
# Install the ChromeDriver that matches the installed Chrome major version (Chrome 115+)
RUN CHROME_VERSION=$(google-chrome --version | grep -oP '\d+\.\d+\.\d+\.\d+') \
    && CHROME_MAJOR=$(echo "$CHROME_VERSION" | cut -d. -f1) \
    && DRIVER_VERSION=$(wget -qO- "https://googlechromelabs.github.io/chrome-for-testing/LATEST_RELEASE_${CHROME_MAJOR}") \
    && wget -q "https://storage.googleapis.com/chrome-for-testing-public/${DRIVER_VERSION}/linux64/chromedriver-linux64.zip" \
    && unzip chromedriver-linux64.zip \
    && mv chromedriver-linux64/chromedriver /usr/local/bin/ \
    && chmod +x /usr/local/bin/chromedriver \
    && rm -rf chromedriver-linux64*
# Create a non-root user
RUN useradd -m -s /bin/bash chromeuser
USER chromeuser
# Verify the installation
RUN google-chrome --version && chromedriver --version
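Build and run
# Build the image
docker build -t chrome-automation .
# Run a container
# Note: the Dockerfile above installs neither Python nor Selenium; add them to the image before running a script this way
docker run -it --rm \
  --shm-size=2g \
  -v "$(pwd)/scripts:/scripts" \
  chrome-automation \
  python /scripts/test.py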
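The `--shm-size=2g` flag is easy to forget, so a script can check the shared-memory mount before it launches Chrome. The following is a minimal sketch that uses only the standard library; the helper names and the 2 GiB threshold are assumptions chosen to match the `docker run` command above.

```python
import shutil

GIB = 1024 ** 3


def shm_total_bytes(path: str = "/dev/shm") -> int:
    """Return the total size in bytes of the filesystem mounted at path."""
    return shutil.disk_usage(path).total


def shm_is_sufficient(total_bytes: int, required_bytes: int = 2 * GIB) -> bool:
    """True when the shared-memory mount is at least required_bytes."""
    return total_bytes >= required_bytes


def describe_shm(total_bytes: int, required_bytes: int = 2 * GIB) -> str:
    """Build a human-readable message for the container logs."""
    size_mib = total_bytes // (1024 ** 2)
    if shm_is_sufficient(total_bytes, required_bytes):
        return f"/dev/shm is {size_mib} MiB: OK"
    return f"/dev/shm is only {size_mib} MiB: pass --shm-size=2g to docker run"
```

Inside the container, `print(describe_shm(shm_total_bytes()))` at the top of a test script makes a missing flag visible before the first browser crash.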
11.3 Selenium Grid
Selenium Grid is the Selenium project's official distributed testing solution. It runs tests in parallel across multiple nodes and multiple browsers.
Architecture
┌──────────────────────────────────────────────────────────┐
│                      Selenium Grid                       │
│                                                          │
│  ┌────────────────────────────────────────────────────┐  │
│  │                    Hub / Router                    │  │
│  │ (receives requests, dispatches to a suitable node) │  │
│  └───────┬──────────────────┬──────────────────┬──────┘  │
│          │                  │                  │         │
│  ┌───────▼──────┐   ┌───────▼──────┐   ┌───────▼──────┐  │
│  │ Node Chrome  │   │ Node Firefox │   │ Node Edge    │  │
│  │ (192.168.1.2)│   │ (192.168.1.3)│   │ (192.168.1.4)│  │
│  │ ≤ 5 sessions │   │              │   │ ≤ 3 sessions │  │
│  └──────────────┘   └──────────────┘   └──────────────┘  │
│                                                          │
└──────────────────────────────────────────────────────────┘
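The dispatch step in the diagram can be modelled in a few lines. The sketch below is a deliberately simplified model and not Selenium's actual Distributor: the `Node` class and `assign_session` function are hypothetical, matching looks only at the browser name and free slots, and the Firefox session limit of 1 is an assumption because the diagram does not state one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a Grid node."""
    address: str
    browser: str
    max_sessions: int
    active_sessions: int = 0

    def has_free_slot(self) -> bool:
        return self.active_sessions < self.max_sessions


def assign_session(nodes: List[Node], browser: str) -> Optional[Node]:
    """Pick the least-loaded node that runs `browser` and has a free slot.

    Returns None when no node can take the session; in a real Grid the
    request would then wait in the session queue.
    """
    candidates = [n for n in nodes if n.browser == browser and n.has_free_slot()]
    if not candidates:
        return None
    chosen = min(candidates, key=lambda n: n.active_sessions / n.max_sessions)
    chosen.active_sessions += 1
    return chosen


nodes = [
    Node("192.168.1.2", "chrome", max_sessions=5),
    Node("192.168.1.3", "firefox", max_sessions=1),  # assumed limit
    Node("192.168.1.4", "MicrosoftEdge", max_sessions=3),
]
```

Calling `assign_session(nodes, "firefox")` twice returns the Firefox node the first time and `None` the second, which mirrors a request that has to queue until a slot frees up.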
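Selenium Grid components (v4)
| Component | Description | Default port |
|---|---|---|
| Router | Entry point; routes each request to the right component | 4444 |
| Distributor | Registers nodes and assigns sessions | 5553 |
| Session Map | Maintains the session-to-node mapping | 5556 |
| Session Queue | Holds session requests that are waiting for a slot | 5559 |
| Event Bus | Internal event bus | 4442/4443 |
| Node | Runs the browsers | 5555 |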
11.4 Deploying Selenium Grid with Docker Compose
Basic configuration
# docker-compose.yml
version: "3.8"
services:
  selenium-hub:
    image: selenium/hub:4.16
    container_name: selenium-hub
    ports:
      - "4444:4444"   # Grid UI & WebDriver API
      - "4442:4442"   # Event Bus publish
      - "4443:4443"   # Event Bus subscribe
      - "5553:5553"   # Distributor (fully distributed mode only; optional in hub mode)
    environment:
      - SE_LOG_LEVEL=INFO
      - SE_SESSION_REQUEST_TIMEOUT=300
      - SE_SESSION_RETRY_INTERVAL=2
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:4444/wd/hub/status"]
      interval: 10s
      timeout: 5s
      retries: 5
  chrome-node:
    image: selenium/node-chrome:4.16
    container_name: chrome-node
    depends_on:
      selenium-hub:
        condition: service_healthy
    environment:
      - SE_EVENT_BUS_HOST=selenium-hub
      - SE_EVENT_BUS_PUBLISH_PORT=4442
      - SE_EVENT_BUS_SUBSCRIBE_PORT=4443
      - SE_NODE_MAX_SESSIONS=3
      - SE_NODE_OVERRIDE_MAX_SESSIONS=true
      - SE_VNC_NO_PASSWORD=1
    shm_size: "2gb"
    ports:
      - "7900:7900"   # noVNC (for debugging)
  firefox-node:
    image: selenium/node-firefox:4.16
    container_name: firefox-node
    depends_on:
      selenium-hub:
        condition: service_healthy
    environment:
      - SE_EVENT_BUS_HOST=selenium-hub
      - SE_EVENT_BUS_PUBLISH_PORT=4442
      - SE_EVENT_BUS_SUBSCRIBE_PORT=4443
      - SE_NODE_MAX_SESSIONS=3
      - SE_NODE_OVERRIDE_MAX_SESSIONS=true
    shm_size: "2gb"
    ports:
      - "7901:7900"   # noVNC (for debugging)
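Startup
# Start the Grid
docker compose up -d
# Check container status
docker compose ps
# Open the Grid web UI (open is the macOS command; use xdg-open on Linux)
open http://localhost:4444
# Watch the browser through noVNC (default password: secret; chrome-node sets SE_VNC_NO_PASSWORD=1, so it asks for none)
open http://localhost:7900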
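Scripts that start right after `docker compose up -d` often race the Grid, so it helps to poll the status endpoint first. The sketch below assumes the standard WebDriver status payload, where readiness is reported under `value.ready`; the function names and the injectable `fetch` parameter are illustrative choices, not part of Selenium.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_status(url: str) -> dict:
    """GET the Grid status endpoint and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return json.load(response)


def is_grid_ready(status: dict) -> bool:
    """Read the readiness flag from a WebDriver status payload."""
    return bool(status.get("value", {}).get("ready", False))


def wait_for_grid(
    url: str = "http://localhost:4444/wd/hub/status",
    timeout: float = 60.0,
    interval: float = 2.0,
    fetch: Callable[[str], dict] = fetch_status,
) -> bool:
    """Poll until the Grid reports ready or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if is_grid_ready(fetch(url)):
                return True
        except (OSError, ValueError):
            # Connection refused or invalid JSON: the hub is still starting
            pass
        time.sleep(interval)
    return False
```

A test entry point can call `wait_for_grid()` and exit with an error when it returns `False`, instead of failing later with a connection error.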
Connecting to the Grid
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless=new")
options.add_argument("--no-sandbox")
# Connect to Selenium Grid
driver = webdriver.Remote(
    command_executor="http://localhost:4444/wd/hub",
    options=options,
)
try:
    driver.get("https://example.com")
    print(f"Title: {driver.title}")
    print(f"Session ID: {driver.session_id}")
finally:
    # Always release the Grid slot, even if the page load fails
    driver.quit()
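The point of a Grid is to run several sessions at once. The following sketch opens one session per URL from a thread pool; it assumes the Grid address used above, and `fetch_title`, `fetch_titles`, and the `make_driver` parameter are hypothetical helpers rather than Selenium APIs. The default of three workers is chosen to match `SE_NODE_MAX_SESSIONS=3`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable

GRID_URL = "http://localhost:4444/wd/hub"


def make_remote_chrome():
    """Open one headless Chrome session on the Grid."""
    # Imported lazily so the helpers below can be exercised without Selenium
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless=new")
    options.add_argument("--no-sandbox")
    return webdriver.Remote(command_executor=GRID_URL, options=options)


def fetch_title(url: str, make_driver: Callable = make_remote_chrome) -> str:
    """Load url in a fresh session and return the page title."""
    driver = make_driver()
    try:
        driver.get(url)
        return driver.title
    finally:
        driver.quit()  # free the Grid slot even if the page load fails


def fetch_titles(
    urls: Iterable[str],
    max_workers: int = 3,
    make_driver: Callable = make_remote_chrome,
) -> Dict[str, str]:
    """Fetch titles concurrently, one Grid session per URL."""
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        titles = pool.map(lambda u: fetch_title(u, make_driver), urls)
        return dict(zip(urls, titles))
```

When more workers are started than the Grid has free slots, the extra session requests wait in the Grid's queue, so `max_workers` is best kept at or below the total slot count.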
11.5 Complete Test Environment with Compose
# docker-compose.test.yml
version: "3.8"
services:
  # Application under test
  web-app:
    build: .
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=test
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 5s
      timeout: 3s
      retries: 10
  # Selenium Grid
  selenium-hub:
    image: selenium/hub:4.16
    ports:
      - "4444:4444"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:4444/wd/hub/status"]
      interval: 10s
      timeout: 5s
      retries: 5
  chrome:
    image: selenium/node-chrome:4.16
    depends_on:
      selenium-hub:
        condition: service_healthy
    environment:
      - SE_EVENT_BUS_HOST=selenium-hub
      - SE_EVENT_BUS_PUBLISH_PORT=4442
      - SE_EVENT_BUS_SUBSCRIBE_PORT=4443
      - SE_NODE_MAX_SESSIONS=5
    shm_size: "2gb"
  # Test runner
  test-runner:
    build:
      context: .
      dockerfile: Dockerfile.test
    depends_on:
      web-app:
        condition: service_healthy
      selenium-hub:
        condition: service_healthy
    environment:
      - SELENIUM_HUB_URL=http://selenium-hub:4444/wd/hub
      - APP_URL=http://web-app:3000
    volumes:
      - ./test-results:/app/test-results
# Dockerfile.test
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY tests/ ./tests/
COPY conftest.py .
# The --html option comes from the pytest-html plugin, so requirements.txt must list it
CMD ["pytest", "tests/", "--html=test-results/report.html", "-v"]
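The test runner receives its endpoints through the `SELENIUM_HUB_URL` and `APP_URL` environment variables set in the Compose file. A minimal sketch of how test code might read them follows; the `RunnerSettings` class, the helper functions, and the localhost defaults for runs outside Compose are assumptions for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class RunnerSettings:
    hub_url: str
    app_url: str


def load_settings(env: Mapping[str, str] = os.environ) -> RunnerSettings:
    """Read the variables set by docker-compose.test.yml, with local defaults."""
    return RunnerSettings(
        hub_url=env.get("SELENIUM_HUB_URL", "http://localhost:4444/wd/hub"),
        app_url=env.get("APP_URL", "http://localhost:3000").rstrip("/"),
    )


def page_url(settings: RunnerSettings, path: str) -> str:
    """Join the application base URL and a path such as '/login'."""
    return f"{settings.app_url}/{path.lstrip('/')}"
```

A fixture in `conftest.py` could call `load_settings()` once and pass `hub_url` to `webdriver.Remote`, so the same tests run both inside Compose and against a locally started Grid.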
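11.6 Playwright Docker Deployment
Playwright Docker image
# Keep the image tag in sync with the Playwright version pinned in package.json
FROM mcr.microsoft.com/playwright:v1.40.0-jammy
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
# The base image already ships browsers for v1.40.0; this step only downloads anything if the versions differ
RUN npx playwright install
CMD ["npx", "playwright", "test"]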
Playwright Compose
version: "3.8"
services:
  web-app:
    build: .
    ports:
      - "3000:3000"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 5s
      timeout: 3s
      retries: 10
  playwright:
    image: mcr.microsoft.com/playwright:v1.40.0-jammy
    depends_on:
      web-app:
        condition: service_healthy
    working_dir: /app
    volumes:
      - .:/app
      - node_modules:/app/node_modules
    environment:
      - BASE_URL=http://web-app:3000
      - CI=1
    command: npx playwright test
    shm_size: "2gb"
volumes:
  node_modules:
11.7 Autoscaling Selenium Grid
Kubernetes deployment
# selenium-grid-k8s.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: selenium-hub
spec:
  replicas: 1
  selector:
    matchLabels:
      app: selenium-hub
  template:
    metadata:
      labels:
        app: selenium-hub
    spec:
      containers:
        - name: selenium-hub
          image: selenium/hub:4.16
          ports:
            - containerPort: 4444   # Grid UI & WebDriver API
            - containerPort: 4442   # Event Bus publish
            - containerPort: 4443   # Event Bus subscribe
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
          readinessProbe:
            httpGet:
              path: /wd/hub/status
              port: 4444
            initialDelaySeconds: 10
            periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: selenium-hub
spec:
  selector:
    app: selenium-hub
  ports:
    # Nodes register through the Event Bus, so 4442 and 4443 must be reachable via the Service
    - name: http
      port: 4444
      targetPort: 4444
    - name: publish
      port: 4442
      targetPort: 4442
    - name: subscribe
      port: 4443
      targetPort: 4443
  type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: selenium-chrome-node
spec:
  replicas: 3
  selector:
    matchLabels:
      app: selenium-chrome
  template:
    metadata:
      labels:
        app: selenium-chrome
    spec:
      containers:
        - name: chrome
          image: selenium/node-chrome:4.16
          env:
            - name: SE_EVENT_BUS_HOST
              value: "selenium-hub"
            - name: SE_EVENT_BUS_PUBLISH_PORT
              value: "4442"
            - name: SE_EVENT_BUS_SUBSCRIBE_PORT
              value: "4443"
            - name: SE_NODE_MAX_SESSIONS
              value: "3"
          resources:
            requests:
              memory: "1Gi"
              cpu: "1000m"
            limits:
              memory: "2Gi"
              cpu: "2000m"
          volumeMounts:
            - name: shm
              mountPath: /dev/shm
      volumes:
        # Memory-backed emptyDir replaces Docker's --shm-size
        - name: shm
          emptyDir:
            medium: Memory
            sizeLimit: "2Gi"
# Deploy to Kubernetes
kubectl apply -f selenium-grid-k8s.yaml
# Scale the Chrome nodes manually
kubectl scale deployment selenium-chrome-node --replicas=5
# Check pod status
kubectl get pods -l app=selenium-chrome
Autoscaling with KEDA
# keda-scaledobject.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: selenium-chrome-scaledobject
spec:
  scaleTargetRef:
    name: selenium-chrome-node
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    - type: selenium-grid
      metadata:
        url: http://selenium-hub:4444/graphql
        browserName: chrome
        unsafeSsl: "true"
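To reason about what the ScaledObject will do under load, it helps to compute the replica count by hand. The function below is a simplified model of demand-based scaling and not KEDA's exact algorithm: it assumes every queued or active session needs one slot, that each node offers `SE_NODE_MAX_SESSIONS` slots (3 in the Deployment above), and that the result is clamped to `minReplicaCount` and `maxReplicaCount`.

```python
import math


def desired_replicas(
    queued_sessions: int,
    active_sessions: int,
    sessions_per_node: int = 3,
    min_replicas: int = 1,
    max_replicas: int = 10,
) -> int:
    """Nodes needed to serve all queued and active sessions, clamped to the bounds."""
    if sessions_per_node < 1:
        raise ValueError("sessions_per_node must be at least 1")
    demand = queued_sessions + active_sessions
    needed = math.ceil(demand / sessions_per_node)
    return max(min_replicas, min(max_replicas, needed))
```

With 4 queued and 3 active sessions the model asks for 3 nodes; with 100 queued sessions it stops at the configured maximum of 10, and the remaining requests stay in the queue.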
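11.8 Container Resource Tuning
Resource allocation recommendations
| Component | CPU | Memory | Notes |
|---|---|---|---|
| Selenium Hub | 0.5-1 core | 512 MB-1 GB | Routing and management |
| Chrome Node (single session) | 1-2 cores | 1-2 GB | Includes /dev/shm |
| Chrome Node (3 sessions) | 2-4 cores | 2-4 GB | Shared by all sessions |
| Firefox Node | 1-2 cores | 1-2 GB | Slightly less than Chrome |
| Playwright Runner | 2-4 cores | 2-4 GB | Includes all browsers |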
/dev/shm tuning
# docker-compose.yml
services:
  chrome:
    image: selenium/node-chrome:4.16
    # Option 1: mount a tmpfs (commented out; use one option, not both)
    # tmpfs:
    #   - /dev/shm:size=2g
    # Option 2: shm_size (equivalent to docker run --shm-size)
    shm_size: "2gb"
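Memory limits and OOM handling
services:
  chrome:
    image: selenium/node-chrome:4.16
    deploy:
      resources:
        limits:
          cpus: "2.0"
          memory: 2G
        reservations:
          cpus: "1.0"
          memory: 1G
    environment:
      # Caps the JVM heap of the node's Selenium server; the browser processes are bounded by the container limit above
      - SE_JAVA_OPTS=-Xmx1g -XX:+UseG1GC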
11.9 Monitoring and Logging
Grid status monitoring
# Grid status API
curl http://localhost:4444/wd/hub/status
# GraphQL query (node details live under nodesInfo, not under grid)
curl -X POST http://localhost:4444/graphql \
  -H "Content-Type: application/json" \
  -d '{"query": "{ grid { sessionCount, maxSession }, nodesInfo { nodes { id, status, sessions { id } } } }"}'
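The same GraphQL endpoint can feed a monitoring script. The sketch below posts a query with the standard library and reduces the response to a few numbers; it assumes the schema exposes `grid { sessionCount, maxSession }` and `nodesInfo { nodes { id, status, sessions } }` and that a healthy node reports the status `UP`. The `summarize` helper and its output keys are hypothetical.

```python
import json
import urllib.request

QUERY = "{ grid { sessionCount, maxSession }, nodesInfo { nodes { id, status, sessions { id } } } }"


def query_grid(url: str = "http://localhost:4444/graphql", query: str = QUERY) -> dict:
    """POST a GraphQL query to the Grid and return the decoded response."""
    body = json.dumps({"query": query}).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.load(response)


def summarize(response: dict) -> dict:
    """Reduce the GraphQL response to a few numbers suitable for alerting."""
    data = response["data"]
    nodes = data["nodesInfo"]["nodes"]
    used = data["grid"]["sessionCount"]
    capacity = data["grid"]["maxSession"]
    return {
        "nodes_up": sum(1 for n in nodes if n["status"] == "UP"),
        "nodes_total": len(nodes),
        "sessions_used": used,
        "sessions_capacity": capacity,
        "utilization": used / capacity if capacity else 0.0,
    }
```

Running `summarize(query_grid())` on a schedule and alerting when `utilization` stays near 1.0 gives an early signal that more nodes are needed.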
Docker log aggregation
# docker-compose.yml
services:
  selenium-hub:
    image: selenium/hub:4.16
    logging:
      driver: "json-file"
      options:
        max-size: "10m"   # rotate each log file at 10 MB
        max-file: "3"     # keep at most 3 rotated files
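11.10 Key Takeaways
| Takeaway | Description |
|---|---|
| /dev/shm should be at least 2 GB | Chrome rendering relies on shared memory |
| Selenium Grid 4 has a clearer architecture | Router → Distributor → Node |
| One-command startup with Compose | Hub + several browser nodes + test runner |
| K8s + KEDA autoscaling | The node count follows the number of sessions |
| noVNC debugging | Watch the browser inside the container from a web browser |
| Official Playwright image | mcr.microsoft.com/playwright works out of the box |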
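11.11 Caveats
⚠️ Insufficient /dev/shm crashes Chrome: Docker gives a container only 64 MB of /dev/shm by default, and running out of it is one of the most common reasons Chrome crashes in Docker. Always set shm_size: 2gb.
⚠️ Network isolation: Inside Compose, services reach each other by service name. Do not use localhost, which refers to the container itself.
⚠️ Pin image versions: In production, pin the Selenium image versions so the Grid and its nodes never run mismatched versions.
⚠️ Health checks: Make sure the health checks for the Hub and the web application are configured correctly; otherwise nodes may try to connect before the Hub is ready.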
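The version-pinning caveat can be enforced in CI with a small check. The following sketch scans a Compose or Kubernetes file for `selenium/*` images with a regular expression instead of a YAML parser, which keeps it dependency-free but means it only recognises one `image:` entry per line; the function names are hypothetical.

```python
import re
from typing import Dict, List

# Matches lines such as "image: selenium/hub:4.16"; the tag is optional
IMAGE_LINE = re.compile(
    r"""^\s*-?\s*image:\s*["']?(selenium/[\w.-]+)(?::([\w.-]+))?["']?\s*$"""
)


def selenium_image_tags(text: str) -> Dict[str, str]:
    """Map each selenium/* image in the file to its tag ('latest' if untagged)."""
    tags: Dict[str, str] = {}
    for line in text.splitlines():
        match = IMAGE_LINE.match(line)
        if match:
            tags[match.group(1)] = match.group(2) or "latest"
    return tags


def version_problems(text: str) -> List[str]:
    """Return human-readable problems; an empty list means the tags are consistent."""
    tags = selenium_image_tags(text)
    problems = [
        f"{image} uses the floating tag 'latest'"
        for image, tag in tags.items()
        if tag == "latest"
    ]
    if len(set(tags.values())) > 1:
        problems.append(f"mixed Selenium versions: {sorted(set(tags.values()))}")
    return problems
```

A CI step could read `docker-compose.yml`, call `version_problems`, and fail the build when the list is not empty.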
11.12 Further Reading
| Resource | Link |
|---|---|
| Selenium Docker images | https://github.com/SeleniumHQ/docker-selenium |
| Selenium Grid documentation | https://www.selenium.dev/documentation/grid/ |
| Selenium Grid configuration | https://www.selenium.dev/documentation/grid/configuration/ |
| Playwright Docker | https://playwright.dev/docs/docker |
| KEDA Selenium Scaler | https://keda.sh/docs/2.12/scalers/selenium-grid-scaler/ |