Extract text from PDF
You need just the text from a PDF — no layout, no images, just clean copy you can paste into another document, search, or feed into a script.
Why this works
PDFRun extracts the full text content as plain text (.txt), Markdown (.md), or JSON. Text-based PDFs come through instantly; scanned PDFs run through OCR automatically when needed.
How it works
1. Open the extract tool: Tap the orange button above. Output defaults to plain .txt.
2. Upload your PDF: Drop the file in. We auto-detect whether OCR is needed.
3. Choose output format: Plain text, Markdown (with headings preserved), or JSON for downstream parsing.
4. Download the text: Save and use anywhere, no PDF reader required.
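Once the text is downloaded, the Markdown option is the easiest one to post-process, because the preserved headings mark where each section starts. The sketch below is one possible way to split such a file into sections. It assumes the headings are written in the usual `#` style; the function name `split_markdown_sections` is made up for this example and is not part of PDFRun.

```python
import re

# Matches ATX-style headings such as "# Title" or "### Subsection".
HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*$")


def split_markdown_sections(markdown_text):
    """Return a list of (level, title, body) tuples, one per section.

    Text that appears before the first heading is returned with
    level 0 and an empty title.
    """
    sections = []
    level, title, body = 0, "", []

    def flush():
        # Skip the empty "section" that exists before any content is seen.
        if title or any(line.strip() for line in body):
            sections.append((level, title, "\n".join(body).strip()))

    for line in markdown_text.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            level, title, body = len(match.group(1)), match.group(2), []
        else:
            body.append(line)
    flush()
    return sections


if __name__ == "__main__":
    sample = "# Intro\nHello\n\n## Methods\nStep one\nStep two\n"
    for section in split_markdown_sections(sample):
        print(section)
```

This simple version treats any line starting with `#` as a heading, so a `#` comment inside a fenced code block would be misread as one; handle fences separately if your PDFs contain code.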
Real-world uses
Researchers
Quote and cite text from papers without retyping.
Translators
Get clean source text for CAT tools and TM matching.
Developers
Pipe PDF content into scripts, search indexes or RAG systems.
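For the search-index and RAG case, extracted text usually has to be cut into smaller pieces before indexing. The following is a minimal sketch of paragraph-based chunking, assuming the plain-text output separates paragraphs with blank lines. The name `chunk_text` and the `max_chars` and `overlap` parameters are illustrative choices, not anything PDFRun defines.

```python
import re


def chunk_text(text, max_chars=1000, overlap=1):
    """Group paragraphs into chunks of roughly max_chars characters.

    The last `overlap` paragraphs of each chunk are repeated at the start
    of the next one so that context carries across chunk boundaries.
    max_chars is a soft target: a single long paragraph, or carried-over
    paragraphs plus a new one, can exceed it.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks, current = [], []
    for paragraph in paragraphs:
        candidate = "\n\n".join(current + [paragraph])
        if current and len(candidate) > max_chars:
            chunks.append("\n\n".join(current))
            # Carry the trailing paragraphs forward as shared context.
            current = current[-overlap:] if overlap > 0 else []
        current.append(paragraph)
    if current:
        chunks.append("\n\n".join(current))
    return chunks


if __name__ == "__main__":
    sample = "aaaa\n\nbbbb\n\ncccc"
    print(chunk_text(sample, max_chars=10, overlap=0))
    print(chunk_text(sample, max_chars=10, overlap=1))
```

Splitting on paragraphs rather than fixed character offsets keeps sentences intact, which tends to matter more for retrieval quality than hitting an exact chunk size.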
Common questions
Will column order be preserved?
Yes — we follow reading order so two-column layouts come out as logical paragraphs, not interleaved gibberish.
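To show why reading order matters, here is a simplified sketch of one common approach to two-column pages: assign each text block to a column, then read each column top to bottom. It is an illustration only and not a description of PDFRun's actual method. The `Block` type, its fields, and the midpoint rule are assumptions made for this example, and it ignores full-width elements such as titles.

```python
from dataclasses import dataclass


@dataclass
class Block:
    x0: float  # left edge of the block
    x1: float  # right edge of the block
    y0: float  # top edge, measured downward from the top of the page
    text: str


def two_column_reading_order(blocks, page_width):
    """Return block texts ordered left column first, then right column."""
    middle = page_width / 2

    def column(block):
        # A block belongs to the column that contains its horizontal center.
        center = (block.x0 + block.x1) / 2
        return 0 if center < middle else 1

    ordered = sorted(blocks, key=lambda b: (column(b), b.y0))
    return [b.text for b in ordered]


if __name__ == "__main__":
    page = [
        Block(50, 280, 100, "L1"),
        Block(320, 550, 100, "R1"),
        Block(50, 280, 200, "L2"),
        Block(320, 550, 200, "R2"),
    ]
    print(two_column_reading_order(page, page_width=600))
```

A plain top-to-bottom sort of the same blocks would give L1, R1, L2, R2, which is the interleaving described in the answer above.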
What about scanned PDFs?
OCR is applied automatically. Pick the source language for best accuracy.
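If you want to check your own files before uploading, a rough way to spot scanned pages is to look at how much text each page's text layer contains. The sketch below assumes you already have the per-page text as a list of strings (for example from a PDF library such as pypdf). The function name `pages_needing_ocr` and the `min_chars` threshold are hypothetical, and this heuristic is not necessarily how PDFRun's own detection works.

```python
def pages_needing_ocr(page_texts, min_chars=20):
    """Return 0-based indexes of pages whose text layer looks empty.

    A page counts as "probably scanned" when it has fewer than
    min_chars non-whitespace characters of extractable text.
    """
    flagged = []
    for index, text in enumerate(page_texts):
        visible_chars = len("".join(text.split()))
        if visible_chars < min_chars:
            flagged.append(index)
    return flagged


if __name__ == "__main__":
    pages = [
        "This page has a real text layer with plenty of characters.",
        "",
        "  \n 3 ",
    ]
    print(pages_needing_ocr(pages))
```

The threshold is a judgment call: a page holding only a page number or a stamp still has a few characters, so a small nonzero cutoff catches more scans than checking for an empty string.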