A beautiful, 100% offline desktop app for extracting text from PDF files. Available for macOS and Windows.
Reduce your LLM token usage by converting your PDFs to .txt files. You can also combine it with my document anonymization tool to strip personal data before the text reaches an LLM (helpful for EU GDPR compliance).
Grab the build for your OS from the latest Release:
macOS:
- Download PDFTextExtractor.zip and unzip it (Finder unzips automatically on double-click).
- Drag PDFTextExtractor.app into your Applications folder.
- The very first time you launch it: right-click the app → Open, then click Open in the dialog. (macOS shows that warning for any app that isn't notarized by Apple. It's a one-time click — after that it opens normally from Launchpad/Spotlight.)
Requires macOS 11 (Big Sur) or newer, Apple Silicon or Intel.
Windows:
- Download PDFTextExtractor.exe.
- Double-click it.
- The first time, Windows SmartScreen may show "Windows protected your PC" — click More info → Run anyway. (One-time bypass for any unsigned app.)
Requires Windows 10 (1809) or newer, x64.
That's it on either platform. No Python, no Terminal, no pip install. The app is fully self-contained and runs entirely offline — your PDFs never leave your computer.
| Feature | Description |
|---|---|
| Drag & Drop | Drop PDFs directly onto the window |
| Open dialog | ⌘O — open one or many PDFs at once |
| Multi-file sidebar | Switch between docs instantly |
| Page markers | Text is split by page for easy navigation |
| Search | ⌘F — highlights all matches in yellow |
| Copy text | ⇧⌘C — copy all extracted text to clipboard |
| Save as .txt | ⌘S — save extracted text as a plain-text file |
| Remove file | ⌘W — remove current doc from the list |
| Show in Finder | Right-click any sidebar item |
| Dark mode | Automatically follows system appearance |
| Background loading | Large PDFs load without freezing the UI |
| Shortcut | Action |
|---|---|
| ⌘O | Open PDF(s) |
| ⌘F | Show search bar |
| Escape | Close search bar |
| ⇧⌘C | Copy extracted text |
| ⌘S | Save as .txt |
| ⌘W | Remove current file |
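The search feature highlights every match in the extracted text. As a rough illustration of the matching step only, here is a minimal sketch that assumes case-insensitive, literal (non-regex) matching. `find_matches` is a hypothetical helper written for this README, not a function taken from pdf_text_extractor.py, and the app's actual search behavior may differ.

```python
import re
from typing import List, Tuple


def find_matches(text: str, query: str) -> List[Tuple[int, int]]:
    """Return (start, end) offsets of every non-overlapping match of query.

    Assumption: matching is case-insensitive and treats the query literally.
    """
    if not query:
        # An empty query would otherwise match at every position.
        return []
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    return [match.span() for match in pattern.finditer(text)]


if __name__ == "__main__":
    for start, end in find_matches("PDF text from a pdf", "pdf"):
        print(start, end)
```

Each (start, end) pair could then be passed to whatever highlighting mechanism the UI toolkit provides.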
The same pdf_text_extractor.py source is packaged for both platforms with PyInstaller. You only need this section if you're rebuilding the binary yourself.
cd build_app
./build_app.sh
Outputs dist/PDFTextExtractor.app (~120 MB). The first run takes ~1–2 minutes (it downloads PyQt6 + PyMuPDF + PyInstaller into build_app/.venv/); rebuilds take ~20 seconds. Requires macOS 11+ with Command Line Tools and Homebrew Python 3.10–3.12.
On a Windows machine:
cd windows
build_app.bat
Outputs windows\dist\PDFTextExtractor.exe (single self-contained file, ~80 MB). Requires Python 3.10–3.12 from python.org with "Add python.exe to PATH" checked. See windows/README.md for details.
Don't have a Windows machine? Push a tag like v2.1.0 and the build-windows.yml GitHub Actions workflow builds the .exe on a Windows runner and attaches it to the matching Release automatically.
macOS:
chmod +x run.sh
./run.sh
Windows:
pip install -r requirements.txt
python pdf_text_extractor.py
This installs PyMuPDF + PyQt6 into your active Python env and launches the script directly, which is handy while editing pdf_text_extractor.py.
The SwiftUI_Xcode/ folder contains a SwiftUI + PDFKit version of the app. PDFKit ships with macOS, so this version has zero runtime dependencies and the resulting binary is only ~2–5 MB.
To build it:
- Open Xcode → File → New → Project → macOS → App
- Name it PDFTextExtractor, set Interface to SwiftUI, Language to Swift
- Delete the auto-generated ContentView.swift
- Drag all the .swift files from SwiftUI_Xcode/ into the project
- Press ⌘R
Requires Xcode 14+ and macOS 13+.
PDF text extraction uses PyMuPDF (the fitz library) — one of the fastest and most accurate PDF parsing libraries available. It pulls the actual text layer embedded in the PDF, so no OCR is needed for normal digital PDFs. Scanned-only PDFs without an embedded text layer will show "(No extractable text found)".
The SwiftUI version uses Apple's PDFKit framework, which ships with macOS and does the same thing natively.
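To make the text-layer approach concrete, here is a minimal sketch of per-page extraction with PyMuPDF. It is not the code from pdf_text_extractor.py: the helper names `join_pages` and `extract_text` and the `--- Page N ---` marker format are assumptions made for illustration. Only `fitz.open` and `page.get_text()` are PyMuPDF calls.

```python
from typing import Iterable, List

NO_TEXT_PLACEHOLDER = "(No extractable text found)"


def join_pages(page_texts: Iterable[str]) -> str:
    """Join per-page text with page markers (marker format is an assumption)."""
    parts: List[str] = []
    for number, text in enumerate(page_texts, start=1):
        cleaned = text.strip()
        if cleaned:
            parts.append(f"--- Page {number} ---\n{cleaned}")
    # If no page had a text layer, the PDF is probably scanned-only.
    return "\n\n".join(parts) if parts else NO_TEXT_PLACEHOLDER


def extract_text(pdf_path: str) -> str:
    """Read the embedded text layer of every page with PyMuPDF (no OCR)."""
    import fitz  # PyMuPDF; imported lazily so join_pages works without it

    with fitz.open(pdf_path) as doc:
        # The generator is consumed inside the with-block, while doc is open.
        return join_pages(page.get_text() for page in doc)
```

Calling `extract_text("example.pdf")` (a placeholder path) would return the marked-up text, or the placeholder string when no page has an embedded text layer.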
The release pipeline is intentionally boring. To cut a new release covering both platforms:
- Update the version in build_app/PDFTextExtractor.spec, windows/version_info.txt, and CHANGELOG.md.
- Build and zip the macOS app:
  cd build_app && ./build_app.sh
  cd ../dist && zip -ry PDFTextExtractor.zip PDFTextExtractor.app
- Tag and push:
  git tag -a v2.1.1 -m "Release v2.1.1"
  git push origin v2.1.1
- GitHub Actions automatically builds PDFTextExtractor.exe on a Windows runner and attaches it to the v2.1.1 Release.
- Upload the macOS zip to the same Release (run this from the repository root):
  gh release upload v2.1.1 dist/PDFTextExtractor.zip
Clients then grab whichever file matches their OS from the Releases page.
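If you cut releases often, the manual commands can be generated from a single version number. The following is a small sketch, not a script that ships with this repo: `release_commands` is a hypothetical helper, and it assumes tags always follow the vMAJOR.MINOR.PATCH pattern used in the examples above. It only builds the command strings and leaves running them to you.

```python
import re
from typing import List


def release_commands(version: str) -> List[str]:
    """Build the shell commands for a release, in order, from the repo root."""
    tag = f"v{version}"
    # Assumption: tags look like v2.1.1 (three dot-separated integers).
    if not re.fullmatch(r"v\d+\.\d+\.\d+", tag):
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {version!r}")
    return [
        # Subshells keep the working directory at the repo root afterwards.
        "(cd build_app && ./build_app.sh)",
        "(cd dist && zip -ry PDFTextExtractor.zip PDFTextExtractor.app)",
        f'git tag -a {tag} -m "Release {tag}"',
        f"git push origin {tag}",
        # Run this last, once the Release for the tag exists.
        f"gh release upload {tag} dist/PDFTextExtractor.zip",
    ]


if __name__ == "__main__":
    print("\n".join(release_commands("2.1.1")))
```

Printing the commands instead of executing them keeps the sketch safe to run and lets you review each step before pasting it into a terminal.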
Want to skip the SmartScreen / right-click→Open step on clients' machines? That requires code signing: Apple notarization for macOS ($99/yr Developer account) and an OV/EV code-signing certificate for Windows ($200–400/yr). Both build scripts have clearly marked spots to plug those in.
ChatReadyPDF/
├── pdf_text_extractor.py ← Shared cross-platform app source
├── requirements.txt ← PyMuPDF + PyQt6
├── run.sh ← Run from source on macOS/Linux
├── AppIcon.png ← Source icon (used by both platforms)
├── ui_preview.png ← Screenshot for the README
├── generate_assets.py ← Helper for regenerating screenshots
│
├── build_app/ ← macOS build
│ ├── PDFTextExtractor.spec ← PyInstaller config (.app bundle)
│ ├── build_app.sh ← One-command builder
│ ├── AppIcon.icns ← Generated by build_app.sh (gitignored)
│ └── .venv/ ← Build venv (gitignored)
│
├── windows/ ← Windows build
│ ├── PDFTextExtractor_win.spec ← PyInstaller config (single .exe)
│ ├── version_info.txt ← VERSIONINFO resource for the .exe
│ ├── build_app.bat ← One-command builder
│ ├── AppIcon.ico ← Multi-resolution Windows icon
│ └── README.md ← Windows-specific build notes
│
├── .github/workflows/
│ └── build-windows.yml ← Auto-builds .exe on tag push
│
├── dist/ ← macOS build output (gitignored)
│ └── PDFTextExtractor.app
│
└── SwiftUI_Xcode/ ← Alternative native macOS version
  ├── PDFTextExtractorApp.swift
  ├── DocumentStore.swift
  ├── ContentView.swift
  ├── SidebarView.swift
  └── TextDetailView.swift
MIT. See LICENSE.
