Docker lets you package all the complex components screenshot services depend on into a container that runs in any environment with Docker installed. But it's still hard to know which packages to include, or how to configure the container for maximum efficiency, security, and performance.
In this article, we're going to walk you through implementing a Docker image that serves a website screenshot capture API. We'll create the API using Puppeteer, then write the Dockerfile that builds the container image. We'll also highlight key best practices and potential gotchas along the way. Let's get started.
Why Use Docker for a Website Screenshot Service?
Website screenshot services programmatically capture content from webpages. They're typically built on web browser automation tools like Puppeteer. Puppeteer provides an API for controlling a Chrome or Firefox browser instance. You can use code to navigate to your target webpage, then save a screenshot capture.
Puppeteer can be challenging to use in traditional server environments. To successfully capture screenshots, you need a functioning browser installation. In turn, this requires several dependencies not normally found in a typical headless environment, such as a display server, graphics packages, and fonts.
Docker helps solve these problems. Packaging your code, Puppeteer, and browser dependencies as a Docker image lets you deploy your service with consistent results each time. Beyond reliable releases to production, Docker lets developers use tools like Docker Compose to easily start a local instance of the service. They don't have to manually install Chrome, Puppeteer, and all the related dependencies on their machines. Similarly, using a Docker image to power your CI pipelines and testing processes guarantees that the dependency versions used for testing will exactly match those deployed to production.
Tools You Need for a Dockerized Website Screenshot Service
Creating a Dockerized website screenshot service requires more than just Docker. In this guide, we'll use the following tools:
- Browser Control Library (Puppeteer): Puppeteer is a JavaScript library that provides an API for remotely controlling Chrome or Firefox. It interacts with the browser using the DevTools Protocol or WebDriver BiDi. Puppeteer defaults to using a headless browser instance, so the browser interface won't be displayed. This is ideal for server-side use in a screenshot service. If you prefer not to use Puppeteer, then Selenium is an alternative option you could try.
- Programming Language (Node.js): We're using Node.js as our screenshot service's programming language and runtime environment. Node.js is a good choice because Puppeteer is a JavaScript library; using JavaScript for our application code makes it possible to directly call native Puppeteer APIs. However, Puppeteer bridge layers are available for many other popular programming languages.
- HTTP Server (Express): You need an HTTP server and web framework if you want to expose your screenshot service as a web API for users to interact with. Express is a popular framework for Node.js.
- Docker: Docker is a complete toolkit for building and running containers. It provides a convenient interface to lower-level technologies such as the BuildKit image build system and the containerd container runtime. Docker creates OCI-compliant images that work with any container platform, whether Docker itself, an alternative like Podman, or orchestrators such as Kubernetes.
With the tool list out of the way, let's begin building our Dockerized website screenshot service.
Guide: Building a Docker Image for a Website Screenshot Service
To get started building the service, first use npm to install the dependencies we'll be using: Puppeteer and Express.
$ npm install puppeteer express
Now we’ll walk through creating the image. If you just want to see and run the code, then you can find all the project files on GitHub.
1. Creating the API Code
First, save the following code as `main.js` in your working directory. The code uses Express to serve an HTTP API on port 3000. The API provides a `/capture` endpoint that invokes Puppeteer to screenshot the webpage specified by the `url` query parameter. For example, sending a request to `localhost:3000/capture?url=https://www.google.com` will save a screenshot of Google's homepage.
const express = require("express");
const puppeteer = require("puppeteer");

const app = express();
const appPort = 3000;

(async () => {
  const browser = await puppeteer.launch();

  process.on("beforeExit", async () => {
    await browser.close();
  });

  app.get("/capture", async (req, res) => {
    const page = await browser.newPage();
    await page.goto(req.query.url);
    await page.screenshot({path: `/captures/${Date.now()}.png`});
    await page.close();
    res.status(204).send();
  });

  app.listen(appPort, () => {
    console.log(`Service is listening on port ${appPort}...`);
  });
})();
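One robustness gap is worth noting before moving on: if `page.goto()` rejects (for example, on an invalid URL or a navigation timeout), the handler above never closes the page or responds to the client. The cleanup pattern can be sketched as a small helper (`withPage` is our own name, not a Puppeteer API):

```javascript
// Run a callback with a freshly opened page, guaranteeing the page is
// closed even when navigation or the screenshot throws.
async function withPage(browser, fn) {
  const page = await browser.newPage();
  try {
    return await fn(page);
  } finally {
    await page.close();
  }
}

// Usage inside the /capture handler would look like:
//   await withPage(browser, async (page) => {
//     await page.goto(req.query.url);
//     await page.screenshot({path: `/captures/${Date.now()}.png`});
//   });
```

The handler could then report navigation failures with an error status instead of leaking pages and hanging requests.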
There are a few important points to note in this code:
- The browser instance is created as soon as the process starts, using `puppeteer.launch()`. This ensures the browser is ready to use when requests arrive, improving performance and enabling multiple connections to be served by the same instance.
- The `beforeExit` Node.js event is used to close the browser instance before the Node.js process terminates. This prevents redundant browser instances from accumulating.
- The `/capture` endpoint uses Puppeteer's APIs to open a new page (tab) in the headless browser instance, navigate to the requested URL, and then save the screenshot into the `/captures` directory in the filesystem. We'll mount a Docker volume to this path in our container later on.
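Note also that the demo passes `req.query.url` straight to `page.goto()`, which would happily load `file://` or other internal URLs. A real service should validate the parameter first; here's a minimal sketch using Node's built-in `URL` class (`isAllowedUrl` is our own helper, not part of the code above):

```javascript
// Check that a user-supplied string is an absolute http(s) URL before
// handing it to page.goto(). Rejects file://, chrome://, javascript:, etc.
function isAllowedUrl(input) {
  let parsed;
  try {
    parsed = new URL(input);
  } catch {
    return false; // not a parseable absolute URL
  }
  return parsed.protocol === "http:" || parsed.protocol === "https:";
}

console.log(isAllowedUrl("https://www.google.com")); // true
console.log(isAllowedUrl("file:///etc/passwd"));     // false
```

The handler could then return a `400` response when validation fails instead of passing the value to Puppeteer.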
Now you're ready to create the Docker image that'll run your service.
2. Creating the Dockerfile
A Dockerfile is a list of instructions that describe how to build a Docker image. It assembles the image's filesystem by copying files from your Docker host, running commands, and setting metadata. This article isn't an exhaustive guide to Dockerfile features, but you can find more detailed information in Docker's documentation.
To continue, save the following Dockerfile as `Dockerfile` in your project's working directory; we'll explain each instruction below:
FROM node:22
EXPOSE 3000
WORKDIR /app
VOLUME /captures
ENV PUPPETEER_SKIP_DOWNLOAD=true
ENV PUPPETEER_EXECUTABLE_PATH=/usr/bin/chromium
ENV XDG_CACHE_HOME=/tmp/.chromium
ENV XDG_CONFIG_HOME=/tmp/.chromium
RUN apt-get update && \
    apt-get install -y \
    chromium \
    libasound2 \
    libatk-bridge2.0-0 \
    libatk1.0-0 \
    libcups2 \
    libdbus-1-3 \
    libgbm1 \
    libjpeg62-turbo \
    libnss3 \
    libpng16-16 \
    libxcomposite1 \
    libxdamage1 \
    libxkbcommon0 \
    libxrandr2
COPY package.json .
COPY package-lock.json .
RUN npm ci
COPY main.js .
USER 1001
CMD ["node", "main.js"]
Let's now walk through the Dockerfile line-by-line.
FROM node:22
This line sets the base image to the official `node:22` image available on Docker Hub. It includes the Node.js v22 LTS release and comes with npm already installed.
EXPOSE 3000
This metadata line indicates that the service in the container will listen on port 3000. This is the port we specified in our code.
WORKDIR /app
This line specifies the container working directory to be used by the following `COPY` instructions.
VOLUME /captures
The `/captures` path in the container is designated as a mount point so that generated screenshots will be automatically stored outside the container, on your Docker host. We'll discuss volume mounts in more detail below.
`ENV PUPPETEER_SKIP_DOWNLOAD` and `ENV PUPPETEER_EXECUTABLE_PATH`
Setting the `PUPPETEER_SKIP_DOWNLOAD` environment variable disables Puppeteer's built-in Chrome installation process that normally runs when you `npm install` or `npm ci`. Manually installing Chromium instead simplifies filesystem permission management and lets you manage your Chromium and Puppeteer versions independently. The `PUPPETEER_EXECUTABLE_PATH` environment variable tells Puppeteer where to find the Chromium binary when this method is used.
`ENV XDG_CACHE_HOME` and `ENV XDG_CONFIG_HOME`
Overriding these environment variables sets the directories where Chromium will store user data. These are changed from their defaults to avoid directory permission conflicts that can occur when running the process as a non-root user.
`RUN apt-get update` and `apt-get install`
This instruction uses Debian's apt package manager to install Chromium and its key dependencies. The list of libraries is the minimum required to successfully launch Chromium and generate screenshots with Puppeteer. It includes key X11 display libraries, the D-Bus message bus system, and JPEG and PNG handling libraries.
`COPY package.json` and `npm ci`
This step copies your npm package files from your working directory, then uses `npm ci` to install your dependencies in the container. The step comes after the apt dependencies are installed because your npm dependencies will change more frequently than the apt packages; keeping the more volatile step later ensures Docker's layer cache is used efficiently.
Copying Source Code
The remaining steps copy the app's source code, specify a non-root user for the container to run as (we'll discuss this in more detail below), and instruct Docker to run `main.js` as the container's foreground process.
3. Creating a Docker Compose File
This step is optional, but it makes it easier to build and run your container image consistently. Save the following file as `docker-compose.yml`:
services:
  screenshotter:
    image: urlbox-docker-screenshot-demo:latest
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - 3000:3000
    cap_add:
      - SYS_ADMIN
    volumes:
      - ./captures:/captures
This Docker Compose file defines our image's name and specifies it should be built from `Dockerfile`. The `ports` section sets up a port binding from port 3000 on your host to the container's port 3000, allowing you to access the Express API on `localhost:3000`.
The `cap_add` section grants the `SYS_ADMIN` capability to the container. This allows Chromium to successfully run as a non-root user with sandboxing enabled (we'll explain this in more detail below).
Under the `volumes` section, a bind mount is configured between the `captures` folder in your working directory and `/captures` in the container. It enables the screenshots saved by the API to be viewed on your host under `captures`. Because the Chromium process in the container runs as the non-root user `1001`, you must set permissions on the directory that allow other users to write to it (a world-writable directory is acceptable for this demo; in production, prefer changing the directory's owner to UID 1001):
$ mkdir -p captures
$ chmod 777 captures
4. Build the Container Image
Now we're ready to use Docker Compose to build and tag the container image:
$ docker compose build
The build output will be displayed in your terminal window. Wait until you see that the image has been built successfully before continuing. The build may take several minutes to complete while the Chromium and Puppeteer dependencies are installed.
5. Start a Container and Test Your Screenshot Service
It's now time to test the service!
Use `docker compose up -d` to start your service as a background process on your Docker host. This will use the config in your `docker-compose.yml` file to create a container, assign the `SYS_ADMIN` capability, bind mount the `captures` directory, and set up a port binding on port 3000.
$ docker compose up -d
[+] Running 2/2
✔ Network url001_default Created 0.0s
✔ Container url001-screenshotter-1 Started 0.3s
Once the container has started, try using curl to request `localhost:3000/capture?url=https://www.google.com`. You should see that the screenshot is saved to your `captures` directory after a few seconds.
$ curl 'localhost:3000/capture?url=https://www.google.com'
We've now successfully created a Docker container for this simple screenshot service! You can proceed to manage the container using standard Docker or Docker Compose commands, such as `docker compose logs` to check the container's output:
$ docker compose logs
screenshotter-1 | Service is listening on port 3000...
Best Practices for Website Screenshot Service Docker Containers
The Docker image we created above is an effective starting point to containerize Puppeteer-based screenshot services. However, you should also note the following best practices to improve your service's performance, security, and scalability.
Run the Container as a Non-Root User
Docker defaults to running containers as the `root` user. If an attacker successfully compromises the process running in the container, they may also be able to gain control of your Docker host. This is particularly important for Puppeteer applications because browsers require many capabilities and have a large attack surface. Use the Dockerfile `USER` instruction—as shown in the image we created above—to mitigate this risk by ensuring your container processes run as a non-root user.
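Our Dockerfile uses a bare numeric UID, which works without creating an account in the image. If you prefer a named user (for readable process listings and a usable home directory), a sketch using Debian's `useradd` follows; the `screenshotter` user name is our own choice:

```dockerfile
# Create an unprivileged user with a fixed UID, then switch to it
RUN useradd --uid 1001 --create-home screenshotter
USER screenshotter
```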
Keep Chrome's Linux Sandbox Enabled
It's crucial to keep Chrome's sandboxing protections enabled when you're running Puppeteer in a Docker container. Chrome uses Linux sandboxes to isolate web content, preventing malicious JavaScript from breaking out of the browser process. The sandbox requires your containers to run with the `SYS_ADMIN` capability enabled, as shown above.
Many Docker and Puppeteer tutorials advise using the `--no-sandbox` Puppeteer launch option. When this option is used, the container doesn't need extra capabilities and Chrome will successfully run as Docker's default `root` user. However, the option completely disables sandboxing protections, so it shouldn't be used in production. It's safer to use a non-root container user, assign the `SYS_ADMIN` container capability, and keep sandboxing enabled.
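As a narrower alternative to granting `SYS_ADMIN`, some deployments keep the sandbox working by running the container under a custom seccomp profile that allows the extra syscalls Chrome needs. A sketch of the Compose configuration is below; the profile path is hypothetical, and you must supply a suitable profile file yourself:

```yaml
services:
  screenshotter:
    security_opt:
      - seccomp:./chrome-seccomp.json
```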
Ensure Efficient Docker Image Layer Caching
Running the build for a Puppeteer Docker image can take several minutes; there are many large dependencies to download and install. To reduce waiting times, you should optimize the order of instructions in your Dockerfile to improve layer caching efficiency.
Docker creates a new layer for each instruction in your Dockerfile; those layers will be reused on the next build if none of the layers above them have changed. Content that changes infrequently—such as installing `apt` and `npm` dependencies—should therefore come before operations like copying in your source code.
In the following example, the copy operation will invalidate the layer cache each time your source code changes:
COPY package.json .
COPY src/ .
RUN npm ci
This causes `npm ci` to run on every build, even though only your source code might have changed. Restructuring the Dockerfile as follows produces a more efficient build sequence:
COPY package.json .
RUN npm ci
COPY src/ .
Now Docker only has to run `npm ci` if your `package.json` file has changed.
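Caching also benefits from a small build context. A `.dockerignore` file keeps local artifacts out of the context entirely, so changes to them can never invalidate a `COPY` layer. The entries below assume this project's layout:

```
node_modules
captures
*.log
.git
```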
Optimize Image Size Using Multi-Stage Builds
Docker images for Puppeteer-based screenshot services can quickly grow large. The image created in this article is roughly 1.9 GB, for instance. The Chromium binary and dependencies carry unavoidable weight, but taking advantage of Docker's multi-stage build features can help optimize your final output size. Running code compilation processes in a separate stage lets you discard any development tools that don't need to be retained in the final image, for example:
FROM node:22 AS build
COPY package*.json ./
RUN npm ci
COPY src/ .
RUN npm run build -- --output-dir /build/app

FROM node:22
# (install Chromium + Puppeteer dependencies here)
COPY --from=build /build/app .
CMD ["node", "main.js"]
This image will only include the Puppeteer dependencies and the compiled `/build/app` output. The dependencies downloaded by the `build` stage's `npm ci` command won't be present in the final image.
Include Common Font Packages so Screenshots Render Correctly
The screenshots produced by a Dockerized Puppeteer instance may show incorrect fonts compared with what you see when you visit the website. Many websites rely on common system fonts, but these won't automatically be present inside your container. Adding key font packages like `fonts-freefont-ttf` and `ttf-mscorefonts-installer` to your Dockerfile will ensure common fonts or close alternatives are available.
RUN apt-get update && apt-get install -y fonts-freefont-ttf
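For sites that use non-Latin text or emoji, coverage can be extended with Debian's Noto font packages, which exist in the main repository:

```dockerfile
RUN apt-get update && \
    apt-get install -y \
    fonts-freefont-ttf \
    fonts-noto-core \
    fonts-noto-cjk \
    fonts-noto-color-emoji
```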
Run Puppeteer as a Separate Microservice Alongside Your App
Puppeteer and the code that controls it are best operated as a standalone microservice alongside your app. This lets you scale the two components independently, improving availability and performance.
Our demo image is a good example of what a standalone Puppeteer microservice could look like. It provides a simple API that your other microservices can call to capture a screenshot. Decoupling Puppeteer in this way also makes it easier to change between screenshot capture services if you need to. For instance, you could replace your custom Puppeteer solution with Urlbox’s ready-to-use capture API by updating the endpoint that your application code calls.
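As a sketch of what a caller might look like, another Node.js service could build the capture request like this (the `screenshotter` host name matches the Compose service name; `buildCaptureUrl` is our own helper, not part of the demo code):

```javascript
// Build the capture request URL for the screenshot microservice.
function buildCaptureUrl(baseUrl, targetUrl) {
  const u = new URL("/capture", baseUrl);
  u.searchParams.set("url", targetUrl); // percent-encodes the target URL
  return u.toString();
}

// A caller could then trigger a capture with fetch():
//   await fetch(buildCaptureUrl("http://screenshotter:3000", "https://www.google.com"));
```

Because all capture requests flow through one URL-building function, swapping the backend for a hosted API later means changing only this helper.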
Summary
Website screenshot services provide a convenient interface to programmatically capture webpage images. They can help you archive user-generated content, scrape information, or analyze a page's visual changes over time.
Browser automation tools like Puppeteer provide the technical foundation for building a website screenshot service, but their complex dependencies mean it's often challenging to configure new environments. Creating a Docker image lets you operate your service reliably, whether in development, CI/CD, or production, without risking missing dependencies or conflicting versions. You can also use your Docker image to scale your service by deploying multiple replicas using a container orchestrator like Kubernetes.
The screenshot service and container image we've created in this guide are still a simplified example. A real service will usually need many more features to ensure quality results, such as:
- Full page screenshot support
- Proxying to circumvent regional content blocks
- Evasions to deal with captchas, scroll jacking, infinite scroll, and cookie banners
- GPU access to ensure WebGL and WebGPU sites render correctly
- Support for other output formats such as PDF, scrolling video, or direct S3 image uploads (typically requiring additional dependencies)
This list is just a starting point of what you might want to include. If it all seems too complicated—and we think it is—then check out Urlbox instead. Urlbox is a simple screenshot API that you can call from your code. We make it easy to generate high-quality website captures without having to manually configure or operate Puppeteer. Urlbox supports all the capabilities listed above and many more, ensuring your captures always render as expected. You can get started for free with a 7-day trial.