# Document processing application with CLI and API interfaces

A TypeScript implementation using the ZeroX library for OCR and document extraction with vision models.
## Prerequisites

- Node.js (v16 or higher)
- npm or pnpm
- OpenAI API key

## Installation

1. Clone this repository
2. Install dependencies:
   ```bash
   npm install
   # or using pnpm
   pnpm install
   ```

3. Set up your configuration:

   Copy the template file and edit it with your actual values:

   ```bash
   cp .env.template .env
   ```

   Then open the `.env` file and add your OpenAI API key and other configuration options.
Alternatively, you can set the environment variables directly in your shell:

```bash
# On Linux/macOS
export OPENAI_API_KEY=your_api_key_here
export AUTH_ENABLED=true
export AUTH_USERNAME=your_username
export AUTH_PASSWORD=your_password
```
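Whether the values come from `.env` or the shell, it helps to fail fast at startup when a required variable is missing. A minimal sketch — the `requireEnv` helper is hypothetical and not part of this repository:

```typescript
// Hypothetical helper (not in this repo): read an env var, fall back to a
// default, or fail fast when neither is available.
function requireEnv(name: string, fallback?: string): string {
  const value = process.env[name] ?? fallback;
  if (value === undefined) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Example usage: OPENAI_API_KEY has no sensible default, MODEL does.
// const apiKey = requireEnv("OPENAI_API_KEY");
// const model = requireEnv("MODEL", "gpt-4o-mini");
```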
## Usage
To run the OCR example manually after setting up your .env file or environment variables:
```bash
npm run start
# or using pnpm
pnpm start
```

This will:
1. Process a sample PDF document (CS101.pdf) using the specified model
2. Extract text while maintaining the original formatting
3. Save the results to the configured output directory
4. Display a sample of the extracted content
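
The steps above boil down to building a ZeroX options object from the environment. A hedged sketch of what `src/main.ts` plausibly assembles, using the option names from the Configuration Options section and defaults from the Environment Variables table — not the verbatim contents of the file:

```typescript
// Hedged sketch: a ZeroX-style options object assembled from env vars,
// with the defaults documented in this README. The exact shape accepted
// by the library is defined by the ZeroX docs.
const options = {
  filePath: "CS101.pdf", // the sample document processed by npm start
  model: process.env.MODEL ?? "gpt-4o-mini",
  outputDir: process.env.OUTPUT_DIR ?? "./output",
  maintainFormat: (process.env.MAINTAIN_FORMAT ?? "true") === "true",
  concurrency: Number(process.env.CONCURRENCY ?? "5"),
  cleanup: true,
  credentials: { apiKey: process.env.OPENAI_API_KEY },
};
```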
## API Authentication
The API includes basic authentication to secure your endpoints in production environments.
### Configuration
Authentication is controlled by the following environment variables:
| Variable        | Description                              | Default |
| --------------- | ---------------------------------------- | ------- |
| `AUTH_ENABLED`  | Set to `true` to enable authentication   | `false` |
| `AUTH_USERNAME` | Username for basic authentication        | None    |
| `AUTH_PASSWORD` | Password for basic authentication        | None    |

When authentication is enabled, all API endpoints under `/api/*` will require basic authentication.

### Making Authenticated Requests
When authentication is enabled, you need to include the Authorization header with your requests:
```bash
# Using curl
curl -X GET "http://localhost:3000/api/health" \
  -H "Authorization: Basic $(echo -n 'username:password' | base64)"
```

```javascript
// Using JavaScript fetch
fetch('http://localhost:3000/api/health', {
  headers: {
    'Authorization': 'Basic ' + btoa('username:password')
  }
})
```
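
Both snippets construct the same header value: `Basic ` followed by the base64 encoding of `username:password`. In Node, where `Buffer` is the idiomatic way to base64-encode, the equivalent looks like this:

```typescript
// Build the Basic auth header value used in the examples above.
function basicAuthHeader(username: string, password: string): string {
  return "Basic " + Buffer.from(`${username}:${password}`).toString("base64");
}

// basicAuthHeader("user", "pass") → "Basic dXNlcjpwYXNz"
```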
## Environment Variables
The following environment variables can be set in your `.env` file:

| Variable          | Description                             | Default       |
| ----------------- | --------------------------------------- | ------------- |
| `OPENAI_API_KEY`  | Your OpenAI API key                     | (required)    |
| `MODEL`           | The AI model to use                     | `gpt-4o-mini` |
| `OUTPUT_DIR`      | Directory to save results               | `./output`    |
| `MAINTAIN_FORMAT` | Whether to maintain document formatting | `true`        |
| `CONCURRENCY`     | Number of concurrent processes          | `5`           |
| `AUTH_ENABLED`    | Enable API authentication               | `false`       |
| `AUTH_USERNAME`   | Username for API authentication         | None          |
| `AUTH_PASSWORD`   | Password for API authentication         | None          |

## Customization
You can modify the `src/main.ts` file to:

- Change the input document path
- Use a different model
- Adjust processing parameters
- Process specific pages instead of the entire document
- Change output options
## Configuration Options
The ZeroX library supports various configuration options:
- `filePath`: Path or URL to the document to process
- `model`: AI model to use for extraction
- `outputDir`: Directory to save results
- `pagesToConvertAsImages`: Page numbers to process (`undefined` for all)
- `maintainFormat`: Whether to maintain document formatting
- `cleanup`: Whether to clean up temporary files
- `concurrency`: Number of concurrent processes to run
- `credentials`: Authentication credentials for the AI provider (required)

## AI Providers
ZeroX supports multiple AI providers:
1. OpenAI (models like gpt-4o, gpt-4o-mini)
2. Google (Gemini models)
3. Azure OpenAI
4. AWS Bedrock (Claude models)
Each provider requires specific credentials. Refer to the ZeroX documentation for details.
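
Because each provider expects a different credentials object, selection is usually a simple switch. The sketch below is illustrative only: the field and environment-variable names are plausible conventions, not taken from this repo or the ZeroX docs — check the ZeroX documentation for the exact shapes:

```typescript
// Illustrative only: plausible credential shapes per provider. Field and
// env-var names here are assumptions, not confirmed ZeroX API.
type Provider = "openai" | "google" | "azure" | "bedrock";

function credentialsFor(provider: Provider): Record<string, string | undefined> {
  switch (provider) {
    case "openai":
      return { apiKey: process.env.OPENAI_API_KEY };
    case "google":
      return { apiKey: process.env.GOOGLE_API_KEY };
    case "azure":
      return {
        apiKey: process.env.AZURE_OPENAI_API_KEY,
        endpoint: process.env.AZURE_OPENAI_ENDPOINT,
      };
    case "bedrock":
      return {
        accessKeyId: process.env.AWS_ACCESS_KEY_ID,
        secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY,
        region: process.env.AWS_REGION,
      };
  }
}
```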
## License

ISC
# Document Processing CLI
## Installation
```bash
npm install
npm run build
```

## Usage
### Basic Processing

```bash
node dist/cli.js process --file document.pdf --output ./output
```

### Chunking
```bash
node dist/cli.js process --file document.pdf --chunk --max-tokens 500 --overlap 50 --output ./output
```
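
To make the `--max-tokens`/`--overlap` semantics concrete, here is a hedged sketch of overlap-based chunking. The CLI's real tokenizer is presumably model-specific; whitespace splitting stands in for it here, and the helper name is hypothetical:

```typescript
// Hedged sketch of chunking with overlap. Each chunk holds up to
// maxTokens tokens, and consecutive chunks share `overlap` tokens.
function chunkTokens(text: string, maxTokens: number, overlap: number): string[] {
  if (overlap >= maxTokens) {
    throw new RangeError("overlap must be smaller than maxTokens");
  }
  const tokens = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  const step = maxTokens - overlap; // how far the window advances each time
  for (let i = 0; i < tokens.length; i += step) {
    chunks.push(tokens.slice(i, i + maxTokens).join(" "));
    if (i + maxTokens >= tokens.length) break; // last window reached the end
  }
  return chunks;
}
```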
### Embedding Providers
#### OpenAI (Default)
```bash
node dist/cli.js process --file document.pdf --chunk --embedding-provider openai --output ./output
```

#### Azure OpenAI
```bash
node dist/cli.js process --file document.pdf --chunk \
  --embedding-provider azure \
  --azure-api-key "your-azure-api-key" \
  --azure-base-url "https://your-resource.openai.azure.com/openai/" \
  --azure-api-version "2024-02-01" \
  --azure-deployment "your-deployment-name" \
  --output ./output
```

### Translation
```bash
node dist/cli.js process --file document.pdf \
  --chunk --max-tokens 500 --overlap 50 \
  --language "es" \
  --embedding-provider azure \
  --azure-api-key "your-azure-api-key" \
  --azure-base-url "https://your-resource.openai.azure.com/openai/" \
  --azure-deployment "your-deployment-name" \
  --output ./output
```

## CLI Options
- `--file`: Path to the document file (required)
- `--extension`: File extension override
- `--language`: Target language for translation (ISO 639-1 code)
- `--output`: Output directory (default: `./output`)
- `--chunk`: Enable document chunking
- `--max-tokens`: Maximum tokens per chunk
- `--overlap`: Overlap tokens between chunks
- `--existing-tags`: Comma-separated list of existing tags
- `--embedding-model-provider`: Embedding provider: 'openai' or 'azure' (default: 'openai')
- `--embedding-api-key`: ...
- `--embedding-endpoint`: ...
- `--embedding-deployment`: ...
- `--llm-model-provider`: LLM provider: 'openai' or 'azure' (default: 'openai')
- `--llm-api-key`: ...
- `--llm-endpoint`: ...
- `--llm-deployment`: ...

## Environment Variables
- `OPENAI_API_KEY`: Required for OpenAI provider
- `MODEL`: OpenAI model to use (default: `gpt-4o-mini`)