Taking Screenshots of All Pages on a Website
Learn how to capture screenshots of every webpage on a website using sitemap extractors
A common question we receive is: "Are you able to take screenshots of all webpages on a website in one request?"
While Urlbox doesn't natively support bulk website captures in a single request, you can easily achieve this with a little automation, or by combining CaptureDeck with a sitemap extractor. This guide walks you through the process of capturing screenshots of every page on a website, with both code and no-code solutions.
Overview
The process involves two main steps:
- Extract all URLs from the website using its sitemap
- Capture screenshots of all URLs using CaptureDeck (for no-code) or your own automation with Urlbox
Step 1: Getting the List of URLs
Most websites publish a sitemap in XML format that contains all their pages. This sitemap is typically available at /sitemap.xml on the website's domain. Here is ours.
Finding the Sitemap
Companies often place a link to their sitemap in the footer of their main webpage.
Common sitemap locations include:
- https://example.com/sitemap.xml
- https://example.com/sitemap_index.xml
- https://example.com/sitemaps.xml
For example, OpenAI's sitemap is available at: https://openai.com/sitemap.xml
Extracting URLs from the Sitemap
To convert the XML sitemap into a list of all of the website's URLs, you can use an online tool or do it programmatically.
Using an Online Tool
SEOwl Sitemap Extractor is a free tool that extracts URLs from sitemaps. Paste the sitemap URL into the tool, and it will generate a complete list of all pages on the website.
Using Node.js (Programmatic)
For automation or integration into your workflow, we recommend using the sitemapper package:
First, install the package:

```shell
npm install sitemapper
```

Then fetch the sitemap and list its URLs:

```javascript
import Sitemapper from 'sitemapper';

const sitemap = new Sitemapper({
  url: 'https://example.com/sitemap.xml',
  timeout: 10000, // give up after 10 seconds
});

const { sites } = await sitemap.fetch();
console.log(sites);
// ['https://example.com/', 'https://example.com/about', 'https://example.com/contact', ...]
```

The sitemapper package handles nested sitemaps (sitemap indexes) automatically, so you don't need to worry about parsing multiple sitemap files. You could extend this further by accepting just a website URL, trying known sitemap locations until you find one, and then passing that into sitemapper.
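As a sketch of that extension, the helper below probes the common sitemap paths listed earlier, given only a site's base URL, and returns the first one that responds. The helper names and probing order are our own choices, and it assumes Node 18+ for the global `fetch`:

```javascript
// Common sitemap paths to probe, in order of likelihood.
// These names are conventions, not guarantees.
const CANDIDATE_PATHS = ['/sitemap.xml', '/sitemap_index.xml', '/sitemaps.xml'];

// Build the full candidate URLs for a given site.
function candidateSitemapUrls(siteUrl) {
  const base = siteUrl.replace(/\/+$/, ''); // strip trailing slashes
  return CANDIDATE_PATHS.map((path) => base + path);
}

// Probe each candidate and return the first URL that responds OK.
// The result can then be handed to sitemapper as its `url` option.
async function findSitemap(siteUrl) {
  for (const url of candidateSitemapUrls(siteUrl)) {
    try {
      const res = await fetch(url);
      if (res.ok) return url;
    } catch {
      // Network error: try the next candidate.
    }
  }
  return null; // no sitemap found at any known location
}
```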
Step 2: Capturing Screenshots
Once you have your list of URLs, you have several options for capturing screenshots:
Option 1: CaptureDeck (No-Code Solution)
CaptureDeck is a no-code tool built on top of Urlbox that's perfect for bulk screenshot captures. It's the fastest way to get screenshots of all the webpages on a website without writing your own code.
Steps:
- Sign up for a CaptureDeck account
- Create a new "Deck"
- Paste your list of URLs into the deck
- Run the capture
CaptureDeck will process all URLs and provide you with organised screenshots that you can view in the dashboard or download as a ZIP file.
CaptureDeck also allows you to create your own 'presets'. These are combinations of options that aim to take a particular type of screenshot. We have preconfigured social media presets, full page presets, or mobile presets. You can find them in your CaptureDeck team settings.
Option 2: Urlbox API with Custom Script
For more control or integration into existing workflows, you can use the Urlbox API directly. Here's a simple example in JavaScript to process multiple URLs:
```javascript
const urls = [
  'https://example.com/page1',
  'https://example.com/page2',
  'https://example.com/page3'
  // ... your full list of URLs
];

const URLBOX_SECRET = 'your-urlbox-secret';

async function captureScreenshots(urls) {
  return Promise.all(
    urls.map(async (url) => {
      const response = await fetch('https://api.urlbox.com/v1/render/sync', {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
          'Authorization': `Bearer ${URLBOX_SECRET}`
        },
        body: JSON.stringify({
          url,
          full_page: true,
          format: 'png'
        })
      });
      const data = await response.json();
      return { url, screenshot: data };
    })
  );
}

captureScreenshots(urls)
  .then(results => {
    console.log(`Captured ${results.length} screenshots`);
    // Process your results here
  })
  .catch(error => {
    console.error('Error capturing screenshots:', error);
  });
```

Best Practices
Rate Limiting
When processing large numbers of URLs, be mindful of:
- Rate Limits - For very large sites (1000+ pages), consider spacing out your requests to avoid rate limiting
- Target site politeness - Don't overwhelm the target website with too many concurrent requests
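One simple way to space requests out is to send them in fixed-size batches with a pause between batches. Here's a minimal sketch — the `chunk` and `captureInBatches` helper names and default values are our own, and it assumes the `captureScreenshots` function from the script above is in scope:

```javascript
// Split a list into fixed-size batches.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Capture in batches, pausing between batches so the request rate
// stays well under any limits and the target site isn't overwhelmed.
async function captureInBatches(urls, size = 20, pauseMs = 5000) {
  const results = [];
  for (const batch of chunk(urls, size)) {
    results.push(...(await captureScreenshots(batch)));
    await new Promise((resolve) => setTimeout(resolve, pauseMs));
  }
  return results;
}
```

Tune the batch size and pause to your plan's limits; smaller batches with longer pauses are gentler on both your quota and the target site.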
Handling Large Sites
For websites with thousands of pages:
- Batch processing - Process URLs in batches to avoid overwhelming your system
- Storage - Consider using S3 or similar cloud storage for organising large numbers of screenshots
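Before pushing renders to S3 (or similar), you'll typically want them on disk with predictable names. The sketch below assumes each result from the script above includes a URL to the rendered image in its response JSON (here called `renderUrl` — check the response shape in Urlbox's docs for your endpoint); the `fileNameFor` naming scheme is our own:

```javascript
import { writeFile } from 'node:fs/promises';

// Turn a page URL into a filesystem-safe file name (naming scheme is ours).
function fileNameFor(url) {
  return url.replace(/[^a-z0-9]+/gi, '_').slice(0, 100) + '.png';
}

// Download one render to disk and return its local path.
// Assumes `result` is `{ url, screenshot }` from captureScreenshots,
// and that `screenshot.renderUrl` points at the rendered image.
async function saveScreenshot(result, dir = '.') {
  const { url, screenshot } = result;
  if (!screenshot.renderUrl) return null;
  const image = await fetch(screenshot.renderUrl);
  const path = `${dir}/${fileNameFor(url)}`;
  await writeFile(path, Buffer.from(await image.arrayBuffer()));
  return path;
}
```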
Troubleshooting
Sitemap not found?
- Check the website's /robots.txt file, which often lists the sitemap location
- Look for sitemap references in the website's footer or help pages
This method might not always work, as some websites don't include a sitemap.
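The robots.txt check is easy to automate: by convention, a sitemap is declared with a line like `Sitemap: https://example.com/sitemap.xml`. A minimal sketch (the helper names are ours, and global `fetch` assumes Node 18+):

```javascript
// Pull any `Sitemap:` declarations out of a robots.txt body.
// The directive name is case-insensitive by convention.
function sitemapsFromRobots(robotsTxt) {
  return robotsTxt
    .split('\n')
    .map((line) => line.match(/^\s*sitemap:\s*(\S+)/i))
    .filter(Boolean)
    .map((match) => match[1]);
}

// Fetch a site's robots.txt and return any declared sitemap URLs.
async function sitemapsForSite(siteUrl) {
  const res = await fetch(new URL('/robots.txt', siteUrl));
  if (!res.ok) return [];
  return sitemapsFromRobots(await res.text());
}
```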
Getting Help
If you run into any issues or need help processing a particularly large or complex website, don't hesitate to contact our support team. We're happy to help you diagnose your setup and optimise your bulk screenshot workflow.