f2md

Convert PDF, DOCX, and image files to Markdown using AI. This CLI tool extracts text and images and preserves table structure, producing clean, well-formatted Markdown. It also supports OCR text extraction from images.

Features

- PDF Support - Full text extraction, image extraction, and page screenshots for layout understanding
- DOCX Support - Text and image extraction with structure preservation
- Image OCR - Extract text from images (PNG, JPG, JPEG, GIF, WEBP) using AI-powered OCR
- AI-Powered Conversion - Uses Google's Gemini AI to intelligently convert content to Markdown
- Interactive CLI - Friendly prompts using clack.js
- Easy Setup - Built-in configuration wizard for API keys

Installation

Using npx

```bash
npx f2md document.pdf
```

Using bunx

```bash
bunx f2md document.pdf
```

Using pnpm

```bash
pnpm dlx f2md document.pdf
```

Global install

```bash
npm install -g f2md
# or
bun install -g f2md
```

Setup

Before using the tool, configure your Google AI API key.

Interactive setup

```bash
f2md setup
# or with npx
npx f2md setup
```


The setup wizard will:
1. Show you where to get a Google AI API key (https://aistudio.google.com/apikey)
2. Prompt you to enter your API key
3. Ask where to save it (local project or global for all projects)

Manual setup

Alternatively, set the environment variable:

```bash
export GOOGLE_GENERATIVE_AI_API_KEY="your-api-key-here"
```

Or create a .env file in your project:

```
GOOGLE_GENERATIVE_AI_API_KEY=your-api-key-here
```
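When f2md is used as a library, it can help to fail early if the key is missing. The following is a minimal sketch, not part of f2md: `requireApiKey` is a hypothetical helper, and only the variable name `GOOGLE_GENERATIVE_AI_API_KEY` comes from this README.

```typescript
// Hypothetical helper (not part of f2md): checks for the API key up front.
const API_KEY_VAR = "GOOGLE_GENERATIVE_AI_API_KEY";

function requireApiKey(env: Record<string, string | undefined>): string {
  // Treat a missing, empty, or whitespace-only value as "not configured".
  const key = env[API_KEY_VAR]?.trim();
  if (!key) {
    throw new Error(
      `${API_KEY_VAR} is not set; run "f2md setup" or export the variable first.`
    );
  }
  return key;
}
```

In a Node.js or Bun script this would typically be called as `requireApiKey(process.env)` before any conversion starts.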

Usage

Interactive mode

```bash
f2md
```

The tool will prompt you for:

- Input file path (PDF, DOCX, or image)
- Output file path

Command-line arguments

```bash
# Convert with auto-generated output name
f2md document.pdf

# Convert with custom output path
f2md document.pdf output.md

# Extract text from an image (OCR)
f2md screenshot.png

# Extract text from image with custom output
f2md image.jpg output.md
```
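This README does not say how the auto-generated output name is chosen. As an illustration only, one plausible rule is to swap the input file's extension for `.md`. `defaultOutputPath` is a hypothetical function and may not match what f2md actually does.

```typescript
// Hypothetical sketch: derive "document.md" from "document.pdf".
// The real naming rule used by f2md is not documented here.
function defaultOutputPath(inputPath: string): string {
  const lastSlash = Math.max(
    inputPath.lastIndexOf("/"),
    inputPath.lastIndexOf("\\")
  );
  const dot = inputPath.lastIndexOf(".");
  // Count the dot as an extension separator only if it sits inside the
  // file name and is not the file name's first character (e.g. ".env").
  const hasExtension = dot > lastSlash + 1;
  const stem = hasExtension ? inputPath.slice(0, dot) : inputPath;
  return `${stem}.md`;
}
```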

Supported formats

- PDF (.pdf)
- Word Documents (.docx)
- Images (.png, .jpg, .jpeg, .gif, .webp) - OCR text extraction
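The list above maps naturally to a lookup by file extension. The sketch below is an assumption about how such a check could look, not f2md's actual code. `detectInputKind` and `InputKind` are hypothetical names, and case-insensitive matching is assumed.

```typescript
// Hypothetical sketch: classify an input file by its extension.
type InputKind = "pdf" | "docx" | "image";

const IMAGE_EXTENSIONS = new Set(["png", "jpg", "jpeg", "gif", "webp"]);

function detectInputKind(filePath: string): InputKind | null {
  const dot = filePath.lastIndexOf(".");
  if (dot === -1) return null; // no extension at all
  // Assumption: extensions are matched case-insensitively.
  const ext = filePath.slice(dot + 1).toLowerCase();
  if (ext === "pdf") return "pdf";
  if (ext === "docx") return "docx";
  return IMAGE_EXTENSIONS.has(ext) ? "image" : null;
}
```

Returning `null` for anything else lets a caller reject unsupported files before making an API request.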

Options

```bash
f2md --help     # Show help
f2md --version  # Show version
f2md setup      # Configure API key
```

How It Works

Document conversion (PDF/DOCX)

1. Extraction - Reads the input file and extracts text, images, and layout information
2. Processing - For PDFs, captures page screenshots to understand visual layout
3. AI Conversion - Sends extracted content to Google's Gemini AI model
4. Markdown Generation - Receives AI-generated Markdown with proper formatting
5. Cleanup - Removes unused images and saves the final output
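The cleanup step can be pictured as comparing the extracted image files against the image references that appear in the generated Markdown. The sketch below illustrates that idea and is not f2md's implementation. `findUnusedImages` is a hypothetical name, and it assumes image paths appear in the Markdown exactly as they are listed.

```typescript
// Hypothetical sketch of the cleanup idea: list extracted images that the
// generated Markdown never references.
function findUnusedImages(markdown: string, imagePaths: string[]): string[] {
  const referenced = new Set<string>();
  // Matches ![alt](target) and ![alt](target "title"); captures the target.
  const imagePattern = /!\[[^\]]*\]\(([^)\s]+)[^)]*\)/g;
  let match: RegExpExecArray | null;
  while ((match = imagePattern.exec(markdown)) !== null) {
    referenced.add(match[1]);
  }
  // Assumption: paths are compared as exact strings, with no normalization.
  return imagePaths.filter((path) => !referenced.has(path));
}
```

A caller could then delete the returned files and report how many were removed.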

Image OCR

1. Image Processing - Reads the image file and encodes it for AI processing
2. OCR Analysis - Sends the image to Google's Gemini AI with specialized prompts for text extraction
3. Text Extraction - AI extracts all visible text while preserving structure (headings, lists, tables)
4. Markdown Generation - Converts extracted content to well-formatted Markdown
5. Output - Saves the final Markdown file

Development

Prerequisites

- Bun installed

Setup

```bash
# Clone the repository
git clone 
cd f2md

# Install dependencies
bun install

# Run in development mode
bun run dev
```

Build

```bash
bun run build
```

Project structure

```
src/
  cli.ts      - CLI entry point with clack prompts
  convert.ts  - Core conversion logic
  index.ts    - Public API exports
dist/         - Built output (generated)
```

API Usage

You can also use this as a library in your Node.js/Bun projects:

```typescript
import { convert } from "f2md";

const result = await convert("input.pdf", "output.md", {
  onProgress: (message) => console.log(message),
  respectPages: false,
});

console.log(`Saved to: ${result.outputPath}`);
console.log(`Images saved: ${result.imagesSaved}`);
console.log(`Images cleaned: ${result.imagesDeleted}`);
```
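To convert several files, the same call can be wrapped in a loop. The sketch below takes the conversion function as a parameter so that it stays self-contained. `convertAll`, `ConvertFn`, and both interfaces are hypothetical. The field types and the `<input>.md` output naming are assumptions based only on the example above.

```typescript
// Shapes mirror the fields used in the README example; exact types are assumed.
interface ConvertResult {
  outputPath: string;
  imagesSaved: number;
  imagesDeleted: number;
}

interface ConvertOptions {
  onProgress?: (message: string) => void;
  respectPages?: boolean;
}

type ConvertFn = (
  input: string,
  output: string,
  options?: ConvertOptions
) => Promise<ConvertResult>;

// Hypothetical helper: converts files one at a time, in order.
async function convertAll(
  inputs: string[],
  convert: ConvertFn,
  log: (line: string) => void = () => {}
): Promise<ConvertResult[]> {
  const results: ConvertResult[] = [];
  for (const input of inputs) {
    // Assumed naming for this sketch only: append ".md" to the input path.
    const output = `${input}.md`;
    const result = await convert(input, output, {
      onProgress: (message) => log(`[${input}] ${message}`),
    });
    results.push(result);
  }
  return results;
}
```

If the `convert` export from f2md matches this assumed signature, it could be passed in directly, for example `convertAll(["a.pdf", "b.png"], convert, console.log)`. Running the files sequentially keeps the progress output readable and avoids sending many requests at once.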

License

MIT
