How to Extract Images from PDF: 5 Practical Methods Compared

Pulling images out of a PDF comes up more often than people expect. You want a chart from a report dropped into a deck. You want the logo a designer sent inside a layout PDF, on its own. You want a clean copy of a receipt scan you can resend at print quality. The output may look the same, but the method you pick changes the quality, the time it takes, and whether the text inside stays searchable.

This guide compares five methods that actually get used in real workflows. For each one I'll show when it's the right pick, the trade-offs, and — wherever possible — commands you can paste and run today.

1. Convert each page to PNG

This is the fastest and safest default. Rendering a whole PDF page to a PNG keeps the original layout (text, charts, callouts) intact, so even if the page mixes several embedded images, you get one clean image per page that drops straight into a slide deck, a Notion page, or a message sent from your phone.

Our web converter handles this in the browser with no sign-up: drag a PDF up to 100MB, render at up to 2000px on the long edge, then grab pages individually or as a single ZIP.
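A fixed pixel cap means a different effective DPI depending on page size. The sketch below only illustrates that arithmetic, assuming page dimensions are given in PDF points (1 point = 1/72 inch). The function name `dpi_for_long_edge` is hypothetical, and this is not the converter's actual code.

```python
def dpi_for_long_edge(width_pt: float, height_pt: float, max_px: int = 2000) -> float:
    """Return the DPI at which the page's longer side is exactly max_px pixels."""
    # PDF page sizes are expressed in points; 72 points make one inch.
    long_edge_in = max(width_pt, height_pt) / 72.0
    return max_px / long_edge_in


# A4 is 595 x 842 pt; US Letter is 612 x 792 pt.
a4 = dpi_for_long_edge(595, 842)
letter = dpi_for_long_edge(612, 792)
print(f"A4: {a4:.0f} DPI, Letter: {letter:.0f} DPI")
```

For A4 and US Letter this works out to roughly 171 and 182 DPI: comfortable for screens and slides, but below the 300 DPI usually targeted for print or OCR.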

Use it when

You want multiple pages as images in one go
You need the layout (charts, callouts, marginalia) preserved as-is
You don't want to install anything

Trade-offs

You can't pull out a single embedded image — only the whole page
Once rendered as PNG, the text is no longer selectable

2. Extract embedded images with pdfimages

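Many PDFs store their raster images as separate objects, independent of the page layout. pdfimages writes those objects straight to disk. With -all, images stored as JPEG, JPEG 2000, JBIG2, or CCITT come out in their native format with no re-compression, and most other images are written as PNG, so you get the image data at the quality stored in the PDF.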

# macOS: brew install poppler / Ubuntu: apt install poppler-utils
pdfimages -all document.pdf out

# Outputs out-000.jpg, out-001.png, ...
# Or limit to a page range:
pdfimages -all -f 3 -l 5 document.pdf out

Use it when

You need logos, photos, or illustrations back as standalone assets
You're going to print or re-edit and quality matters most
You're pulling dozens or hundreds of images from one PDF

Trade-offs

Vector graphics (drawn shapes, vector logos) are skipped; only raster images come out
An image the PDF stores as separate tiles or strips comes out as separate files, one per piece
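If you run pdfimages across many files, a thin Python wrapper keeps the flags in one place. This is a minimal sketch, assuming pdfimages is on your PATH; the helper names `pdfimages_cmd` and `extract_images` and the `img` output prefix are illustrative, not part of poppler.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def pdfimages_cmd(pdf: str, prefix: str,
                  first: Optional[int] = None,
                  last: Optional[int] = None) -> List[str]:
    """Build a pdfimages command that writes images in their native format."""
    cmd = ["pdfimages", "-all"]
    if first is not None:
        cmd += ["-f", str(first)]  # first page to scan
    if last is not None:
        cmd += ["-l", str(last)]   # last page to scan
    return cmd + [pdf, prefix]


def extract_images(pdf: str, out_dir: str, **page_range) -> List[Path]:
    """Run pdfimages and return the files it wrote, sorted by name."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    prefix = str(out / "img")
    # check=True raises CalledProcessError if pdfimages exits with an error.
    subprocess.run(pdfimages_cmd(pdf, prefix, **page_range), check=True)
    return sorted(out.glob("img-*"))
```

Calling `extract_images("document.pdf", "out", first=3, last=5)` would mirror the page-range command above and hand back the extracted file paths for further processing.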

3. Screenshot the page

For one or two images you just need now, a screenshot is unbeatable. Use ⌘ + Shift + 4 on macOS or Win + Shift + S on Windows to drag a selection and save.

Use it when

One to three images and you're pasting them into a chat immediately
You want PDF reader overlays (highlights, comments) included

Trade-offs

Quality is locked to your screen DPI — usually softer than direct page-to-PNG
Anything past ten pages becomes painful

4. Script it with Python pdf2image or the API

If PDFs land in your workflow every day, automate. Python's pdf2image, a wrapper around poppler, renders each page to a Pillow image that you can save as PNG in a few lines.

# pip install pdf2image (and install poppler)
from pdf2image import convert_from_path

images = convert_from_path("report.pdf", dpi=300)
for i, img in enumerate(images):
    img.save(f"page-{i+1:03d}.png", "PNG")
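One caveat for long documents: convert_from_path returns every rendered page in memory at once, which can get heavy at 300 DPI. A possible workaround, sketched here under the assumption that pdf2image and poppler are installed, is to render a few pages at a time with the library's `first_page` and `last_page` parameters. The helper names `page_chunks` and `convert_in_chunks` are hypothetical.

```python
from typing import Iterator, Tuple


def page_chunks(total_pages: int, chunk: int = 10) -> Iterator[Tuple[int, int]]:
    """Yield 1-based, inclusive (first, last) page ranges covering the document."""
    for first in range(1, total_pages + 1, chunk):
        yield first, min(first + chunk - 1, total_pages)


def convert_in_chunks(pdf_path: str, dpi: int = 300, chunk: int = 10) -> int:
    """Render pdf_path to page-NNN.png files, holding at most `chunk` pages in memory."""
    # Imported here so page_chunks stays usable without pdf2image installed.
    from pdf2image import convert_from_path, pdfinfo_from_path

    total = int(pdfinfo_from_path(pdf_path)["Pages"])
    for first, last in page_chunks(total, chunk):
        images = convert_from_path(pdf_path, dpi=dpi,
                                   first_page=first, last_page=last)
        for offset, img in enumerate(images):
            # Number files by absolute page so names match the single-pass version.
            img.save(f"page-{first + offset:03d}.png", "PNG")
    return total
```

The chunk size is a trade-off: smaller chunks use less memory but start poppler more often.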

If you're on Node.js, or you want to avoid native dependencies and run from a cloud function, call our developer API instead — the API automation guide walks through full Node.js and Python examples.

5. Run OCR to keep the text

Scanned PDFs, handwritten notes, and receipts are already images, so extracting them with any of the methods above leaves you with pictures you can't search. OCR recovers the text from those images, either as a plain text file or as a searchable text layer in a PDF.

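# brew install tesseract tesseract-lang  (tesseract-lang provides kor and other non-English data)
# First convert the page to PNG (methods 1 or 4)
tesseract page-001.png page-001 -l eng+kor

# Produces page-001.txt alongside the PNG
# Append "pdf" to write a searchable page-001.pdf instead:
tesseract page-001.png page-001 -l eng+kor pdf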

OCR is slower than plain rendering, and accuracy depends heavily on font and resolution, especially for Korean. Usually the biggest single improvement is rendering the source page at 300 DPI or higher before passing it to tesseract.
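To OCR a whole folder of rendered pages, the same command can be looped from Python. This is a minimal sketch, assuming tesseract is on your PATH and the pages follow the page-NNN.png naming used earlier; `tesseract_cmd` and `ocr_pages` are illustrative names, not part of tesseract.

```python
import subprocess
from pathlib import Path
from typing import List


def tesseract_cmd(image: str, langs: str = "eng+kor",
                  searchable_pdf: bool = False) -> List[str]:
    """Build a tesseract command; the output base is the image path minus its suffix."""
    base = str(Path(image).with_suffix(""))
    cmd = ["tesseract", image, base, "-l", langs]
    if searchable_pdf:
        # The "pdf" config makes tesseract write <base>.pdf with a text layer.
        cmd.append("pdf")
    return cmd


def ocr_pages(folder: str, **options) -> int:
    """OCR every page-*.png in folder; return the number of pages processed."""
    pages = sorted(Path(folder).glob("page-*.png"))
    for page in pages:
        # check=True stops the batch on the first page tesseract fails on.
        subprocess.run(tesseract_cmd(str(page), **options), check=True)
    return len(pages)
```

With the defaults each page gets a .txt file next to it; passing `searchable_pdf=True` switches the output to one searchable PDF per page.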

Side-by-side

Method	Quality	Batch	Setup
1. Page to PNG	High	Great	None (web)
2. pdfimages	Source-grade	Great	poppler
3. Screenshot	Medium	Poor	None
4. Script / API	High	Great	Python / Node
5. OCR	High + text	Good	tesseract

Bottom line

One line per case. Ad-hoc and one-shot: use the web converter. Source images for design or print: use pdfimages. Daily pipeline: use the API or pdf2image. Searchable text required: add OCR on top. These aren't mutually exclusive either: rendering pages to PNG first and then OCR-ing them is a common production workflow.

Try the web converter

Drag a PDF, get per-page PNGs or a single ZIP. No sign-up, up to 2000px, files deleted after conversion.
