The data you need is sitting right there in a PDF: quarterly numbers, supplier prices, transaction history. The temptation is to copy-paste into Excel. The result is usually heartbreaking: whole rows collapse into a single cell, columns merge, and you end up retyping anyway.
Why copy-paste fails
PDFs store text as positioned glyphs, not as a structured grid. Copy-paste hands those glyphs over in reading order, top to bottom and left to right, with no idea that “column A” exists. A real extractor reconstructs the grid from visual cues (alignment, spacing, ruling lines, alternating backgrounds) and writes each value to its own .xlsx cell.
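To make the idea concrete, here is a minimal Python sketch of grid reconstruction from positions alone. It is an illustration, not the tool's actual algorithm: the Span type, the tolerance values, and the sample coordinates are all assumptions, and it only handles left-aligned columns with one text span per cell.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """Hypothetical text span as a PDF parser might report it."""
    text: str
    x: float  # left edge, in points
    y: float  # baseline, measured from the top of the page


def group_rows(spans, y_tol=3.0):
    """Group spans whose baselines lie within y_tol of the row's first span."""
    rows = []
    for s in sorted(spans, key=lambda s: (s.y, s.x)):
        if rows and abs(rows[-1][0].y - s.y) <= y_tol:
            rows[-1].append(s)
        else:
            rows.append([s])
    return rows


def column_anchors(spans, x_tol=5.0):
    """Cluster left edges into column start positions."""
    anchors = []
    for x in sorted(s.x for s in spans):
        if not anchors or x - anchors[-1] > x_tol:
            anchors.append(x)
    return anchors


def to_grid(spans, y_tol=3.0, x_tol=5.0):
    """Place every span in the cell of its row and nearest column anchor."""
    anchors = column_anchors(spans, x_tol)
    grid = []
    for row in group_rows(spans, y_tol):
        cells = [""] * len(anchors)
        for s in sorted(row, key=lambda s: s.x):
            col = min(range(len(anchors)), key=lambda i: abs(anchors[i] - s.x))
            cells[col] = (cells[col] + " " + s.text).strip()
        grid.append(cells)
    return grid


# Sample data: slightly jittered baselines, as real PDFs often have.
spans = [
    Span("Item", 72, 100), Span("Qty", 200, 100), Span("Price", 300, 100),
    Span("Widget", 72, 115.5), Span("4", 200, 115), Span("9.50", 300, 115),
    Span("Gasket", 72, 130), Span("12", 200, 130.4), Span("1.25", 300, 130),
]
grid = to_grid(spans)
```

Right-aligned numeric columns would need clustering on right edges instead, and ruling lines or background bands, where present, are far stronger cues than spacing alone.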
What good extraction handles
- Bordered tables: trivial, near-perfect output.
- Borderless tables with consistent alignment: usually works; spot-check the column boundaries.
- Multi-page tables: stitched together when row format is consistent.
- Merged cells: represented as merged ranges in the .xlsx, not silently broken.
- Number / currency formatting: detected and applied; totals work in Excel without manual conversion.
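The number and currency handling in the last item can be sketched roughly as follows. This is an assumed approach, not the tool's documented behavior: parse_amount is a hypothetical helper, and it assumes US-style amounts (comma thousands separators, a decimal point, and parentheses or a leading minus for negatives).

```python
import re
from decimal import Decimal

# Digits with correctly grouped thousands separators, or plain digits.
_AMOUNT = re.compile(r"\d{1,3}(,\d{3})*(\.\d+)?|\d+(\.\d+)?")


def parse_amount(text):
    """Return a Decimal for strings like '$1,234.50', or None if not numeric."""
    s = text.strip()
    negative = False
    if s.startswith("(") and s.endswith(")"):  # accounting-style negative
        negative, s = True, s[1:-1].strip()
    if s.startswith("-"):
        negative, s = True, s[1:]
    s = s.lstrip("$€£ ")
    if not _AMOUNT.fullmatch(s):
        return None  # leave as text, e.g. a header or a label
    value = Decimal(s.replace(",", ""))
    return -value if negative else value


column = ["Amount", "$1,234.50", "(300.00)", "65.50"]
parsed = [parse_amount(c) for c in column]
total = sum(v for v in parsed if v is not None)
```

Decimal is used instead of float so that currency totals do not pick up binary rounding error. Cells that fail to parse stay as text, which is why a header like "Amount" does not break the column total.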
What still needs hands
- Tables where rows split unpredictably across pages.
- Two-column documents where each column has its own table.
- Scanned tables — run OCR first, then extract.
Output choices: CSV vs XLSX
CSV is universal and pipe-friendly: drop it into any database, BI tool, or script. It carries plain values only, so formatting and merged cells are lost.
XLSX keeps formatting and structure. Use this when you’re continuing analysis in Excel.
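As a rough illustration of the CSV side, the sketch below writes an extracted grid with Python's standard csv module and reads it back. The grid contents and the grid_to_csv helper are made up for the example; the point is that only text values survive the round trip.

```python
import csv
import io


def grid_to_csv(grid):
    """Serialize a list of rows to CSV text; quoting is handled by csv.writer."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(grid)
    return buf.getvalue()


grid = [
    ["Region", "Q1", "Q2"],
    ["North, East", "1200", "1350"],  # comma inside a cell gets quoted
]
csv_text = grid_to_csv(grid)

# Reading it back yields the same strings: no number formats, no merged ranges.
round_trip = list(csv.reader(io.StringIO(csv_text)))
```

Everything comes back as a string, so any number or currency typing has to be reapplied by whatever consumes the file.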
For repeated jobs (e.g., a monthly financial PDF you always export to a working spreadsheet), save the extraction settings as a workspace so the next run is one click.
Frequently asked questions
My PDF has 200 small tables. Will it extract all of them?
Yes. Each detected table becomes a sheet (XLSX) or a separate file (CSV). Auto-detection runs across the whole document.
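If you post-process extracted tables yourself, the one-file-per-table layout for CSV can be sketched like this. The write_tables helper and the table_001.csv naming scheme are assumptions for illustration, not the tool's actual output names.

```python
import csv
import tempfile
from pathlib import Path


def write_tables(tables, out_dir):
    """Write each table (a list of rows) to its own numbered CSV file."""
    paths = []
    for i, grid in enumerate(tables, start=1):
        path = Path(out_dir) / f"table_{i:03d}.csv"  # hypothetical naming scheme
        with path.open("w", newline="", encoding="utf-8") as f:
            csv.writer(f).writerows(grid)
        paths.append(path)
    return paths


tables = [
    [["Item", "Qty"], ["Widget", "4"]],
    [["Month", "Total"], ["Jan", "1200"]],
]
with tempfile.TemporaryDirectory() as tmp:
    paths = write_tables(tables, tmp)
    names = [p.name for p in paths]
    with paths[1].open(newline="", encoding="utf-8") as f:
        second = list(csv.reader(f))
```

Zero-padded names keep 200 files in document order when sorted alphabetically.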
Are formulas recovered?
No — formulas don't exist in the PDF. Only the displayed values are extracted. Rebuild formulas if you need them.