File2Text

Markdown Converter

Convert PDF, DOCX, PPTX, EPUB, MOBI, XLSX, images, and 50+ formats to Markdown or plain text on Mac. Built-in OCR, Watch Folder, Finder Quick Action, fully offline.

100% Offline · No Data Collection · Free Tier · Mac App Store
Utilities · macOS

Privacy & Security

All data processing happens locally on your device. No uploads, no tracking, no accounts required.

Fully offline processing

Text extraction and OCR run entirely on your Mac. Documents are never uploaded to external servers or cloud services.

No account or registration

Open the app and start converting immediately. No sign-up, no email verification, no login required.

No analytics or tracking

File2Text contains no telemetry, ad SDKs, or behavioral tracking. Your usage data stays private.

Complete data control

Source files, converted output, and any intermediate data remain on your device under your full control.

Features

Smart Document Detection
Recognizes agreements, invoices, bank statements, reports, and manuals, and applies extraction rules tailored to each document type.
Hybrid Text Extraction
Pulls embedded text where possible, runs high-accuracy OCR on scanned pages, and analyses page geometry to keep headings, lists, tables, and columns intact.
Advanced Formatting Engine
Generates proper Markdown headings, bullet and numbered lists, code blocks, and emphasis, and converts complex tables, including financial ones, into clean Markdown tables.
Batch Power
Drag and drop whole folders, or mix file types in a single batch.
Watch Folder
Pick a folder and every new file dropped into it is converted automatically — hands-free.
Finder Quick Action
Right-click any supported file in Finder and choose Convert to Markdown — no need to open the app.
eBook Support
Convert EPUB, MOBI, AZW, and AZW3 eBooks into editable Markdown drafts.
Presentation Support
Convert PPT and PPTX presentations into structured Markdown with slide content preserved.
Private by Design
All processing happens 100% locally. No uploads, no tracking, no accounts, no internet required.

How It Works

Step 1

Drop files or entire folders into the app, use Watch Folder for auto-conversion, or right-click in Finder with Quick Action

Step 2

Choose Markdown or plain text output and configure extraction options like OCR and table handling

Step 3

The engine detects document type, extracts embedded text or runs OCR, and applies structure-aware formatting

Step 4

Export cleaned, well-structured output ready for docs sites, Git repos, note-taking apps, or AI workflows
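
The choice in Step 3 between embedded text and OCR can be pictured with a small sketch. This is an illustration only, not File2Text's actual logic: the `Page` type, the `choose_extraction` function, and the 20-character threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical view of one page, used only for this illustration."""
    embedded_text: str  # text layer found in the file, possibly empty
    has_image: bool     # True when the page contains a raster scan


def choose_extraction(page: Page, min_chars: int = 20) -> str:
    """Return "embedded" when the text layer looks usable, otherwise "ocr".

    The min_chars threshold is an assumed heuristic, not a documented value.
    """
    if len(page.embedded_text.strip()) >= min_chars:
        return "embedded"
    # Too little embedded text: OCR only helps if there is an image to read.
    return "ocr" if page.has_image else "embedded"
```

Under these assumptions, a scanned page with no text layer is routed to OCR, while a native PDF page with a full text layer skips OCR entirely.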

Use Cases

Prepare documents for AI and LLM pipelines
Convert PDFs, Word documents, and scanned images into structured Markdown that large language models can ingest cleanly, reducing token waste and improving retrieval accuracy in RAG systems.
Migrate legacy documents to Markdown repositories
Transform libraries of Word files, RTFs, and PDFs into version-controlled Markdown so technical teams can maintain documentation alongside code in Git.
Extract text from scanned documents and images
Use the built-in OCR engine to pull text from scanned receipts, contracts, reports, and photographs without relying on an external OCR service.
Convert eBooks to editable Markdown
Turn EPUB, MOBI, and AZW eBooks into Markdown drafts for editing, annotation, or republishing in different formats.
Extract content from presentations
Convert PPT and PPTX slide decks into structured Markdown for documentation, meeting notes, or content repurposing.
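
For the AI and LLM use case above, one common downstream step is to split the exported Markdown into heading-delimited chunks before embedding. The sketch below is one possible approach that runs on the exported files. It is not part of File2Text, and the function name `split_by_heading` is an assumption made for this example.

```python
import re

# Matches ATX-style Markdown headings such as "# Title" or "### Section".
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_by_heading(markdown: str) -> list[dict]:
    """Split Markdown text into chunks, one per heading.

    Text before the first heading becomes a chunk with an empty heading.
    Caveat: lines starting with "#" inside fenced code blocks are also
    treated as headings in this simplified version.
    """
    chunks = []
    heading, lines = "", []

    def flush():
        text = "\n".join(lines).strip()
        if heading or text:
            chunks.append({"heading": heading, "text": text})

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            heading, lines = match.group(2).strip(), []
        else:
            lines.append(line)
    flush()
    return chunks
```

Each chunk keeps its heading next to its body text, so a retrieval system can store the heading as metadata alongside the embedded passage.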

Compatible Sources & Providers

Reads common document, image, and data formats, and produces output for Markdown-native tools and AI pipelines.

PDF documents

Extract text from native and scanned PDFs with automatic OCR fallback, preserving headings, tables, and page structure in the Markdown output.

Microsoft Office files

Convert DOC, DOCX, and XLSX files into Markdown while retaining heading hierarchy, list formatting, and table layouts.

Images and scanned pages

Process PNG, JPG, TIFF, HEIC, and other image formats through the OCR engine to produce searchable, editable text.

Structured data formats

Handle JSON, XML, YAML, PLIST, and CSV files by converting them into readable Markdown representations or clean plain text.
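
To show what a "readable Markdown representation" of tabular data can look like, here is a minimal sketch that turns CSV text into a Markdown table. It illustrates the general idea only. File2Text's own output format may differ, and `csv_to_markdown` is a hypothetical helper written for this example.

```python
import csv
import io


def csv_to_markdown(csv_text: str) -> str:
    """Render CSV text as a Markdown table, using the first row as the header."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ""
    header, body = rows[0], rows[1:]
    width = len(header)

    def fmt(cells):
        # Pad or trim each row to the header width and escape literal pipes.
        padded = (list(cells) + [""] * width)[:width]
        return "| " + " | ".join(c.replace("|", "\\|") for c in padded) + " |"

    lines = [fmt(header), "| " + " | ".join("---" for _ in header) + " |"]
    lines.extend(fmt(row) for row in body)
    return "\n".join(lines)
```

For example, the input `name,qty` followed by `apple,3` yields a two-column table with a header row, a separator row, and one data row.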

Static-site generators and note apps

Output Markdown that is directly compatible with Hugo, Jekyll, Gatsby, Obsidian, Notion imports, and other Markdown-native tools.

AI and LLM toolchains

Produce plain text and Markdown suitable for embedding pipelines, retrieval-augmented generation, fine-tuning datasets, and prompt context windows.

Supported Formats

Documents

PDF, DOC/DOCX, RTF/RTFD, TXT, MD, HTML/XHTML

Presentations

PPT/PPTX

eBooks

EPUB, MOBI, AZW/AZW3

Spreadsheets

CSV, TSV, XLSX

Images (with OCR)

PNG, JPG/JPEG, TIFF/TIF, HEIC/HEIF, BMP, GIF, WEBP, and more

Email & Contacts

EML, VCF, ICS

Data Files

XML, JSON, YAML/YML, PLIST, SQL

Configuration & Logs

INI, CFG, CONF, PROPERTIES, LOG

How It Compares

Pandoc

Typical use

The go-to command-line tool for document format conversion among technical users. Supports dozens of markup formats and is highly extensible through Lua filters and custom templates. Requires terminal familiarity and often needs additional dependencies like LaTeX for PDF output.

Great Apps advantage

File2Text provides a visual Mac interface with built-in OCR, smart document detection, and batch drag-and-drop—no terminal, no dependencies, and no template configuration required.

MarkItDown (Microsoft)

Typical use

An open-source Python library and CLI tool from Microsoft for converting Office documents, PDFs, and images to Markdown. Designed for LLM preprocessing pipelines. Requires Python 3.10+ and pip installation.

Great Apps advantage

File2Text is a standalone Mac app with zero setup. It covers more file formats out of the box, includes OCR for scanned documents, and offers batch folder processing through a native interface.

Online OCR and conversion tools

Typical use

Browser-based services like pdf2md, Mathpix, and various OCR sites offer quick one-off conversions. Convenient for single files but require uploading documents to third-party servers.

Great Apps advantage

File2Text processes everything locally on your Mac, supports batch operations across mixed file types, and works offline—essential for confidential documents and recurring workflows.

What Users Say

★★★★★

“We converted a mixed archive of 500+ PDFs and scanned documents into Markdown for our internal knowledge base in a single afternoon. The OCR quality was better than the online tools we had been using.”

Technical Documentation Lead
★★★★★

“File2Text became a key part of our RAG pipeline. The structured Markdown output significantly reduced the preprocessing we had to do before embedding documents for retrieval.”

ML Engineering Manager
★★★★★

“As a freelance technical writer, I constantly receive content in Word and PDF format. This app lets me convert everything to Markdown so I can work in my preferred editor and commit to Git.”

Freelance Technical Writer

Frequently Asked Questions

What file formats does File2Text support?

File2Text handles over 50 formats including PDF, DOC/DOCX, PPT/PPTX, EPUB, MOBI, AZW, RTF, XLSX, CSV, TSV, HTML, PNG, JPG, TIFF, HEIC, BMP, GIF, WEBP, EML, VCF, ICS, XML, JSON, YAML, PLIST, SQL, INI, LOG, and more.

Can File2Text convert eBooks to Markdown?

Yes. EPUB, MOBI, AZW, and AZW3 eBooks are fully supported. The app preserves chapter structure and formatting in the Markdown output.

Can File2Text convert presentations?

Yes. PPT and PPTX files are converted to structured Markdown with slide content preserved.

Can File2Text extract text from scanned documents and images?

Yes. The app includes a built-in OCR engine that processes scanned PDFs and image files. It automatically detects when a page is image-based and switches to OCR extraction.

What is Watch Folder?

Watch Folder lets you pick a directory to monitor. Every new file dropped into it is converted to Markdown automatically, with no further action needed.

What is the Finder Quick Action?

Right-click any supported file in Finder and choose Convert to Markdown. The file is converted without needing to open the app.

How does File2Text compare to Pandoc?

Pandoc is a powerful CLI tool for technical users. File2Text offers a visual Mac interface, built-in OCR, smart document detection, Watch Folder, Finder Quick Action, and batch drag-and-drop without requiring terminal usage or dependencies.

Is File2Text suitable for AI and LLM data preparation?

Yes. Many users feed File2Text output into embedding pipelines, RAG systems, and fine-tuning workflows. The structured Markdown preserves headings, tables, and lists, which improves downstream processing quality.

Does File2Text require an internet connection?

No. All processing, including OCR, happens locally on your Mac. You can use the app in airplane mode or on air-gapped machines.

Is my document data sent to any external service?

No. File2Text processes everything on-device. There are no cloud uploads, no third-party APIs, and no data transmission of any kind.

Ready to get started?

Download File2Text from the Mac App Store.
