ytranscript

[npm package](https://www.npmjs.com/package/@nadimtuhin/ytranscript) | [CI](https://github.com/nadimtuhin/ytranscript/actions/workflows/ci.yml) | [License: MIT](https://opensource.org/licenses/MIT)

Extract transcripts from your entire YouTube watch history in minutes. Build AI-powered video summaries, searchable archives, or feed transcripts directly to Claude, Cursor, and other AI assistants via the built-in MCP server.

Read the blog post: "Automating My Second Brain with YouTube Transcripts"

Why ytranscript?

- No API keys required - Uses YouTube's public innertube API directly
- Works with AI assistants - Built-in MCP server for Claude, Cursor, and others
- Bulk processing - Process thousands of videos from Google Takeout exports
- Resume-safe - Automatically skips already-processed videos
- Multiple formats - JSON, JSONL, CSV, SRT, VTT, plain text

Quick Start

```bash
# Get a transcript in 10 seconds
npx @nadimtuhin/ytranscript get dQw4w9WgXcQ
# Output: "We're no strangers to love, you know the rules..."
```

Installation

```bash
# Global install (recommended for CLI usage)
npm install -g @nadimtuhin/ytranscript

# Or use with npx (no install)
npx @nadimtuhin/ytranscript get VIDEO_ID

# Add to a project (for library usage)
npm add @nadimtuhin/ytranscript
```

Runtimes supported: Node.js 18+ and Bun 1.0+

MCP Server (AI Assistant Integration)

ytranscript includes an MCP (Model Context Protocol) server that lets Claude, Cursor, and other AI assistants fetch YouTube transcripts directly.

| Tool | Description |
|------|-------------|
| `get_transcript` | Fetch transcript with format options (text, segments, srt, vtt) |
| `get_transcript_languages` | List available caption languages for a video |
| `extract_video_id` | Extract video ID from various YouTube URL formats |
| `get_transcripts_bulk` | Fetch transcripts for multiple videos at once |
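To make "various YouTube URL formats" concrete, here is a minimal sketch of ID extraction. It is an illustration only, not the package's implementation: the function name `extractVideoIdSketch` and the set of recognised URL shapes are assumptions.

```typescript
// Hypothetical helper: a sketch of ID extraction, not ytranscript's own code.
const BARE_ID = /^[A-Za-z0-9_-]{11}$/;

// URL shapes this sketch recognises; each captures the 11-character video ID.
const URL_PATTERNS: RegExp[] = [
  /[?&]v=([A-Za-z0-9_-]{11})(?![A-Za-z0-9_-])/, // youtube.com/watch?v=ID
  /youtu\.be\/([A-Za-z0-9_-]{11})(?![A-Za-z0-9_-])/, // youtu.be/ID
  /\/(?:embed|shorts|live)\/([A-Za-z0-9_-]{11})(?![A-Za-z0-9_-])/, // /embed/ID, /shorts/ID, /live/ID
];

function extractVideoIdSketch(input: string): string | null {
  const trimmed = input.trim();
  if (BARE_ID.test(trimmed)) return trimmed; // already a bare ID
  for (const pattern of URL_PATTERNS) {
    const id = pattern.exec(trimmed)?.[1];
    if (id) return id;
  }
  return null; // nothing that looks like a video ID
}
```

For example, `extractVideoIdSketch("https://youtu.be/dQw4w9WgXcQ?t=42")` yields `"dQw4w9WgXcQ"`, and unrecognised input yields `null`.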

Add to `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS):

```json
{
  "mcpServers": {
    "ytranscript": {
      "command": "npx",
      "args": ["-y", "@nadimtuhin/ytranscript", "mcp"]
    }
  }
}
```

Or, if installed globally:

```json
{
  "mcpServers": {
    "ytranscript": {
      "command": "ytranscript-mcp"
    }
  }
}
```

Once configured, you can ask Claude:

- "Get the transcript for this YouTube video: https://youtube.com/watch?v=dQw4w9WgXcQ"
- "Summarize the key points from this video"
- "What languages are available for this video's captions?"
- "Get transcripts for these 5 videos and compare their content"

CLI Usage

```bash
# Basic usage (outputs plain text)
ytranscript get dQw4w9WgXcQ

# From URL
ytranscript get "https://www.youtube.com/watch?v=dQw4w9WgXcQ"

# With specific language
ytranscript get dQw4w9WgXcQ --lang es

# Output as SRT subtitles
ytranscript get dQw4w9WgXcQ --format srt -o video.srt

# Output as JSON with timestamps
ytranscript get dQw4w9WgXcQ --format json
```

List the caption languages available for a video:

```bash
ytranscript info dQw4w9WgXcQ
# Output:
#   en     English (auto-generated)
#   es     Spanish
#   fr     French
```

```bash
# From Google Takeout exports
ytranscript bulk \
  --history "Takeout/YouTube and YouTube Music/history/watch-history.json" \
  --watch-later "Takeout/YouTube and YouTube Music/playlists/Watch later-videos.csv" \
  --out-jsonl transcripts.jsonl \
  --out-csv transcripts.csv

# From a list of video IDs
ytranscript bulk --videos "dQw4w9WgXcQ,jNQXAC9IVRw,9bZkp7q19f0"

# From a file (one ID or URL per line)
ytranscript bulk --file videos.txt

# Resume a previous run (skips already-processed videos)
ytranscript bulk --history watch-history.json --resume
```


YouTube may rate-limit requests. Use these flags to control pacing:

```bash
# --concurrency  Max concurrent requests (default: 4, safe: 1-8)
# --pause-after  Pause after N requests (default: 10)
# --pause-ms     Pause duration in ms (default: 5000)
ytranscript bulk \
  --history watch-history.json \
  --concurrency 4 \
  --pause-after 10 \
  --pause-ms 5000
```

Recommended for large batches: `--concurrency 2 --pause-after 10 --pause-ms 5000`
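To show how the three pacing knobs interact, here is a rough sketch that runs work in batches of `concurrency` and sleeps for `pauseMs` once `pauseAfter` requests have completed. All names (`PacingOptions`, `toBatches`, `runPaced`) are hypothetical, and the tool itself may schedule requests differently, for instance with a sliding pool rather than fixed batches.

```typescript
// Hypothetical names throughout; a sketch of the pacing idea, not ytranscript's internals.
interface PacingOptions {
  concurrency: number; // max requests in flight at once
  pauseAfter: number; // pause once this many requests have completed since the last pause
  pauseMs: number; // pause length in milliseconds
}

// Split the work into batches of at most `size` items.
function toBatches<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

async function runPaced<T, R>(
  items: readonly T[],
  worker: (item: T) => Promise<R>,
  { concurrency, pauseAfter, pauseMs }: PacingOptions,
): Promise<PromiseSettledResult<R>[]> {
  const results: PromiseSettledResult<R>[] = [];
  let done = 0;
  let sinceLastPause = 0;
  for (const batch of toBatches(items, concurrency)) {
    // allSettled keeps one failed video from aborting the whole run.
    results.push(...(await Promise.allSettled(batch.map((item) => worker(item)))));
    done += batch.length;
    sinceLastPause += batch.length;
    // Pause between batches, but not after the final one.
    if (sinceLastPause >= pauseAfter && done < items.length) {
      await new Promise<void>((resolve) => setTimeout(resolve, pauseMs));
      sinceLastPause = 0;
    }
  }
  return results;
}
```

With the recommended values for large batches, this sketch would issue two requests at a time and rest for five seconds after every ten.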

Route requests through an HTTP proxy to avoid rate limiting or to reach YouTube from restricted networks:

```bash
# CLI with proxy
ytranscript get dQw4w9WgXcQ --proxy http://localhost:8080

# Bulk with proxy
ytranscript bulk --history watch-history.json --proxy http://user:pass@proxy.example.com:8080

# With authentication
ytranscript get dQw4w9WgXcQ --proxy http://username:password@proxy:8080
```

Programmatic usage:

```typescript
import { fetchTranscript } from '@nadimtuhin/ytranscript';

const transcript = await fetchTranscript('dQw4w9WgXcQ', {
  proxy: {
    url: 'http://localhost:8080',
  },
});
```

> Proxy support inspired by ytfetcher

Programmatic API

```typescript
import { fetchTranscript } from '@nadimtuhin/ytranscript';

try {
  const transcript = await fetchTranscript('dQw4w9WgXcQ', {
    languages: ['en', 'es'], // Preference order
    includeAutoGenerated: true,
  });

  console.log(transcript.text); // Full transcript text
  console.log(transcript.segments); // Array of { text, start, duration }
  console.log(transcript.language); // 'en'
  console.log(transcript.isAutoGenerated); // true/false
} catch (error) {
  // See "Error Handling" section below
  console.error(error instanceof Error ? error.message : String(error));
}
```

```typescript
import {
  loadWatchHistory,
  loadWatchLater,
  mergeVideoSources,
  processVideos,
} from '@nadimtuhin/ytranscript';

// Load from Google Takeout
const history = await loadWatchHistory('./watch-history.json');
const watchLater = await loadWatchLater('./watch-later.csv');

// Merge and deduplicate
const videos = mergeVideoSources(history, watchLater);

// Process with progress callback
const results = await processVideos(videos, {
  concurrency: 4,
  pauseAfter: 10,
  pauseDuration: 5000,
  onProgress: (completed, total, result) => {
    const status = result.transcript ? 'OK' : 'FAIL';
    console.log(`[${completed}/${total}] ${result.meta.videoId}: ${status}`);
  },
});

// Filter successful results
const transcripts = results.filter((r) => r.transcript);
```

```typescript
import { streamVideos, appendJsonl } from '@nadimtuhin/ytranscript';

// `videos` is the merged list built with mergeVideoSources() in the previous example
for await (const result of streamVideos(videos, { concurrency: 4 })) {
  // Write each result immediately (resume-safe)
  await appendJsonl(result, 'output.jsonl');
}
```
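Writing each result as soon as it arrives is what makes a run resumable: the output file doubles as a record of finished work. As a sketch of that idea, the hypothetical helper below rebuilds the set of processed IDs from JSONL text, assuming each line is a result object shaped like `TranscriptResult` (`{ meta: { videoId }, ... }`). How the CLI's `--resume` flag does this internally is not documented here.

```typescript
// Hypothetical helper; assumes one JSON result per line with a meta.videoId string.
function collectProcessedIds(jsonl: string): Set<string> {
  const ids = new Set<string>();
  for (const line of jsonl.split(/\r?\n/)) {
    if (line.trim() === "") continue; // skip blank lines
    try {
      const record: unknown = JSON.parse(line);
      const videoId = (record as { meta?: { videoId?: unknown } } | null)?.meta?.videoId;
      if (typeof videoId === "string") ids.add(videoId);
    } catch {
      // A partially written last line (for example after a crash) is ignored.
    }
  }
  return ids;
}
```

The resulting set could then be passed as `skipIds` in the bulk options so that a second run only fetches what is missing.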

```typescript
import { fetchTranscript, formatSrt, formatVtt, formatText } from '@nadimtuhin/ytranscript';
import { writeFile } from 'fs/promises';

const transcript = await fetchTranscript('dQw4w9WgXcQ');

// SRT subtitles
const srt = formatSrt(transcript);
await writeFile('video.srt', srt);

// VTT subtitles
const vtt = formatVtt(transcript);
await writeFile('video.vtt', vtt);

// Plain text with timestamps
const text = formatText(transcript, true);
// [0:00] First line of transcript
// [0:05] Second line...
```
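For readers curious how `{ text, start, duration }` segments map onto subtitle cues, here is a small sketch of SRT rendering. It is an illustration, not the library's `formatSrt`: the names `SrtSegment`, `toSrtTimestamp`, and `toSrtSketch` are hypothetical, and the library's output may differ in details such as rounding or line wrapping.

```typescript
// Same fields as TranscriptSegment; times are in seconds.
interface SrtSegment {
  text: string;
  start: number;
  duration: number;
}

// SRT timestamps use the form HH:MM:SS,mmm (comma before the milliseconds).
function toSrtTimestamp(seconds: number): string {
  const totalMs = Math.max(0, Math.round(seconds * 1000));
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const s = totalSeconds % 60;
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  const pad = (n: number, width: number): string => String(n).padStart(width, "0");
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Each cue is: 1-based index, "start --> end", text; cues are separated by a blank line.
function toSrtSketch(segments: readonly SrtSegment[]): string {
  return segments
    .map((seg, i) => {
      const end = seg.start + seg.duration;
      return `${i + 1}\n${toSrtTimestamp(seg.start)} --> ${toSrtTimestamp(end)}\n${seg.text}\n`;
    })
    .join("\n");
}
```

VTT output follows the same pattern, with a `WEBVTT` header line and a period instead of a comma before the milliseconds.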

Error Handling

The library throws errors for various failure cases:

| Error Message | Cause | Solution |
|---------------|-------|----------|
| `No captions available for this video` | Video has no captions/subtitles | Check with `ytranscript info` first |
| `No suitable caption track found` | Requested language not available | Use `includeAutoGenerated: true` or a different language |
| `Caption track is empty` | Captions exist but have no content | Rare; try a different language |
| `HTTP 429` | Rate limited by YouTube | Reduce concurrency, add pauses |
| `HTTP 403` | Video is private or region-locked | Cannot access this video |

```typescript
import { fetchTranscript } from '@nadimtuhin/ytranscript';

try {
  const transcript = await fetchTranscript(videoId);
} catch (error) {
  // `error` is `unknown` in strict TypeScript, so narrow it before reading the message
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes('No captions available')) {
    console.log('This video has no subtitles');
  } else if (message.includes('429')) {
    console.log('Rate limited - slow down requests');
  }
}
```
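Of the failures listed, only rate limiting is likely to succeed on a later attempt. One way to act on the "add pauses" advice is a retry wrapper with exponential backoff, sketched below. The helper names (`isRateLimitError`, `backoffDelayMs`, `withRetry`) and the default delays are assumptions chosen for illustration, not part of the library.

```typescript
// True when the error message mentions HTTP 429, the rate-limit case in the table.
function isRateLimitError(error: unknown): boolean {
  return error instanceof Error && error.message.includes("429");
}

// Exponential backoff: base, 2x base, 4x base, ... capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Run `task`, retrying only rate-limit failures, up to `maxAttempts` tries in total.
async function withRetry<T>(task: () => Promise<T>, maxAttempts = 4): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (error) {
      // Missing captions or private videos will not fix themselves, so rethrow those at once.
      if (!isRateLimitError(error) || attempt + 1 >= maxAttempts) throw error;
      await new Promise<void>((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
    }
  }
}
```

A call would then look like `await withRetry(() => fetchTranscript(videoId))`.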

Limitations

| Scenario | Supported |
|----------|-----------|
| Public videos with captions | ✅ Yes |
| Auto-generated captions | ✅ Yes |
| Manual/community captions | ✅ Yes |
| Private videos | ❌ No |
| Age-restricted videos | ❌ No |
| Live streams (while live) | ❌ No |
| Premiere videos (before premiere) | ❌ No |
| Region-locked videos | ❌ No (unless you're in the allowed region) |

Google Takeout

To export your YouTube data:

1. Go to Google Takeout
2. Deselect all, then select only "YouTube and YouTube Music"
3. Click "All YouTube data included" and select:
   - History → Watch history
   - Playlists (includes Watch Later)
4. Export and download
5. Extract the archive

The relevant files are:

- `Takeout/YouTube and YouTube Music/history/watch-history.json`
- `Takeout/YouTube and YouTube Music/playlists/Watch later-videos.csv`
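As a rough picture of what loading the history file involves, the sketch below pulls unique video IDs out of a watch-history JSON string. It assumes the export is a JSON array whose entries carry the video link in a `titleUrl` field; that shape is an assumption about Takeout's format, and `videoIdsFromHistory` is a hypothetical helper rather than the library's `loadWatchHistory`.

```typescript
// Assumed shape of one Takeout watch-history entry; only the fields used here.
interface TakeoutEntry {
  title?: string;
  titleUrl?: string; // assumed to hold e.g. https://www.youtube.com/watch?v=...
  time?: string;
}

function videoIdsFromHistory(json: string): string[] {
  const parsed: unknown = JSON.parse(json);
  if (!Array.isArray(parsed)) {
    throw new TypeError("expected a JSON array of history entries");
  }
  const seen = new Set<string>();
  for (const entry of parsed as (TakeoutEntry | null)[]) {
    // Entries without a link (for example removed videos) are skipped.
    const id = /[?&]v=([A-Za-z0-9_-]{11})/.exec(entry?.titleUrl ?? "")?.[1];
    if (id) seen.add(id); // a Set keeps first-seen order and drops repeats
  }
  return Array.from(seen);
}
```

Deduplicating at this stage matters because the same video often appears many times in a watch history.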

API Reference

```typescript
interface Transcript {
  videoId: string;
  text: string;
  segments: TranscriptSegment[];
  language: string;
  isAutoGenerated: boolean;
}

interface TranscriptSegment {
  text: string;
  start: number; // seconds
  duration: number; // seconds
}

interface WatchHistoryMeta {
  videoId: string;
  title?: string;
  url?: string;
  channel?: { name?: string; url?: string };
  watchedAt?: string;
  source: 'history' | 'watch_later' | 'manual';
}

interface TranscriptResult {
  meta: WatchHistoryMeta;
  transcript: Transcript | null;
  error?: string; // Present when transcript is null
}

interface FetchOptions {
  languages?: string[]; // Default: ['en']
  timeout?: number; // Default: 30000 (ms)
  includeAutoGenerated?: boolean; // Default: true
  proxy?: ProxyConfig; // Optional proxy configuration
}

interface ProxyConfig {
  url: string; // HTTP proxy URL (e.g., "http://user:pass@host:port")
}

interface BulkOptions extends FetchOptions {
  concurrency?: number; // Default: 4
  pauseAfter?: number; // Default: 10
  pauseDuration?: number; // Default: 5000 (ms)
  skipIds?: Set<string>; // Videos to skip
  onProgress?: (completed: number, total: number, result: TranscriptResult) => void;
}
```

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

- Report bugs via GitHub Issues
- Security issues: see SECURITY.md

License

MIT
