Official SDK for DealCrawl web scraping, crawling and AI agent API
```bash
npm install @dealcrawl/sdk
```

Official TypeScript SDK for the DealCrawl web scraping and crawling API.



Postprocessors Resource (client.postprocessors.*)
Discover and check site-specific enrichment postprocessors (Amazon, Fnac, eBay):
```typescript
// List all available postprocessors
const list = await client.postprocessors.list();
console.log(list.postprocessors);
// [{ name: "amazon", domains: [...], extractedFields: [...] }, ...]
// Check if a URL will be enriched
const check = await client.postprocessors.check({
url: "https://www.amazon.fr/dp/B09V3KXJPB"
});
console.log(check.hasPostprocessor); // true
console.log(check.postprocessor); // "amazon"
```
Usage Resource (client.usage.*)
Monitor usage, quotas, and LLM token consumption:
```typescript
// Get current usage and quotas
const usage = await client.usage.current();
console.log(usage.usage.scrapes); // 150
console.log(usage.quotas.scrapes); // 10000
console.log(usage.percentUsed.scrapes); // 1.5
// Get LLM token usage
const tokens = await client.usage.tokens({ days: 30 });
console.log(tokens.totals.estimatedCostUsd); // 12.34
// Get daily token breakdown
const daily = await client.usage.dailyTokens({ days: 7 });
```
Enable multi-modal vision for better page understanding:
```typescript
const job = await client.agent.create({
url: "https://complex-site.com",
prompt: "Navigate and extract data",
enableVision: true, // NEW: Enable vision-based navigation
visionOptions: { // NEW: Configure vision behavior
quality: 80, // JPEG quality (1-100)
annotateElements: true, // Label clickable elements
captureFrequency: "on_stuck" // When to capture: every_step | on_demand | on_stuck
}
});
```
Control scraping behavior with new options:
```typescript
const job = await client.scrape.create({
url: "https://example.com",
// NEW: Control postprocessor execution
runPostprocessors: true, // Enable/disable site enrichment (default: true)
// NEW: Force browser-based scraping (uses 'render' quota)
forceDynamic: false,
// NEW: Force stealth mode for anti-bot sites
forceStealth: false,
});
```
---
All missing API endpoints now have SDK methods (completes the 87% → 100% alignment):
Status Resource (client.status.*)
- getJobErrors(jobId, options?) - Get job errors without loading full results (useful for debugging large crawls)
Data Resource (client.data.*)
- getJob(jobId) - Get full job details with result
- getJobResult(jobId) - Get only the result of a completed job
- exportJob(jobId, format) - Export job in multiple formats (json, markdown, llm, csv)
Webhooks Resource (client.webhooks.*)
- rotateSecret(webhookId, options) - Rotate webhook secret with grace period support
- getSecretStatus(webhookId) - Check secret version and grace period status
- verifySignature(options) - Verify webhook signature and replay protection
```typescript
// Get job errors for debugging
const errors = await client.status.getJobErrors("job_abc123", { limit: 50 });
// Export crawl results as Markdown
const markdown = await client.data.exportJob("crawl_123", "markdown");
// Rotate webhook secret with 24h grace period
await client.webhooks.rotateSecret("webhook_abc", {
newSecret: "new-secure-key",
gracePeriodHours: 24
});
```
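The verifySignature method listed above has no example yet, so here is a minimal server-side sketch. The option names (payload, signature, timestamp), the valid field on the response, and the header names are assumptions; check the SDK's type definitions.
```typescript
// Sketch: verify an incoming webhook before trusting it. Option names
// (payload, signature, timestamp), the `valid` field, and the header
// names below are assumptions, not documented values.
async function handleIncomingWebhook(rawBody: string, headers: Record<string, string>) {
  const verification = await client.webhooks.verifySignature({
    payload: rawBody,
    signature: headers["x-dealcrawl-signature"],
    timestamp: headers["x-dealcrawl-timestamp"],
  });
  if (!verification.valid) {
    throw new Error("Webhook signature verification failed");
  }
  return JSON.parse(rawBody); // safe to process the event now
}
```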
---
- DataResource: Fixed syntax error in getDealsByCategory() method (unclosed docstring + duplicate line)
- SDK-API Alignment: Verified 87% endpoint coverage with detailed alignment report
The following API endpoints do not have SDK methods yet (see API-SDK Alignment Report):
- GET /v1/status/:jobId/errors - Get job errors
- GET /v1/data/jobs/:jobId - Get full job details
- GET /v1/data/jobs/:jobId/result - Get job result
- GET /v1/data/jobs/:jobId/export - Export job in multiple formats
- POST /v1/webhooks/:id/rotate - Rotate webhook secret
- GET /v1/webhooks/:id/secret-status - Get webhook secret status
- POST /v1/webhooks/verify - Verify webhook signature
These methods will be added in a future release.
---
- SearchOptions: maxResults → limit, autoScrape → scrapeResults, autoScrapeLimit → maxScrapeResults
- BatchScrapeOptions: delay → delayMs
- ExtractModel: Updated to match API (claude-3-5-haiku-20241022, claude-3-5-sonnet-20241022, etc.)
- ApiKeyScope: Removed scrape:batch and search (use scrape scope for both)
- 📸 Screenshot Storage (SEC-011) - Private by default with configurable signed URL TTL
- 🎯 Priority Crawl System - 3-tier queue system (high/medium/low) based on SmartFrontier deal scores
- 🤖 AI Deal Extraction - LLM-powered extraction with customizable score thresholds
- 📝 Markdown Output - Convert scraped content to clean Markdown with GFM support
- 🎬 Browser Actions - Execute preset actions (click, scroll, write, etc.) before scraping
- 🔴 Real-Time SSE Events - Track jobs in real-time with Server-Sent Events (browser only)
- 🛡️ Batch Scrape - Added ignoreInvalidURLs for Firecrawl-compatible error handling
- 🔄 HTML to Markdown - New client.convert.htmlToMarkdown() utility (see the sketch after this list)
- 🚀 Full API Coverage - Access all 50+ DealCrawl API endpoints
- 📦 Zero Dependencies - Uses native fetch, works everywhere
- 🔒 Type-Safe - Complete TypeScript definitions
- ⚡ Automatic Retries - Built-in retry logic with exponential backoff
- 🔄 Polling Helpers - waitForResult() for async job completion
- 🎯 Resource Pattern - Stripe/Twilio-style API design
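The HTML-to-Markdown utility from the list above, as a minimal sketch. Only the method name client.convert.htmlToMarkdown() is documented here; the single-string signature and async return are assumptions, so check the SDK types.
```typescript
import { DealCrawl } from "@dealcrawl/sdk";

const client = new DealCrawl({ apiKey: process.env.DEALCRAWL_API_KEY! });

// Sketch of client.convert.htmlToMarkdown(). The signature shown
// (one HTML string in, markdown string out) is an assumption.
const markdown = await client.convert.htmlToMarkdown(
  "<h1>Flash Sale</h1><p>Up to <strong>50%</strong> off selected laptops</p>"
);
console.log(markdown);
// # Flash Sale
//
// Up to **50%** off selected laptops
```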
```bash
npm install @dealcrawl/sdk
```
Quick Start
```typescript
import { DealCrawl } from "@dealcrawl/sdk";

const client = new DealCrawl({
apiKey: process.env.DEALCRAWL_API_KEY!,
});
// Scrape a single page with deal extraction and screenshot
const job = await client.scrape.create({
url: "https://shop.example.com/product",
extractDeal: true,
screenshot: { enabled: true },
outputMarkdown: true, // NEW: Get clean markdown output
});
// Wait for result with automatic polling
const result = await client.waitForResult(job.jobId);
console.log(result.data.parsed.markdown); // Markdown content
console.log(result.data.screenshot); // Public screenshot URL
```
Real-Time Events (SSE) - Browser Only 🔴
Track jobs in real-time using Server-Sent Events (SSE). Browser only - for Node.js, use polling via client.waitForResult().
```typescript
// 1. Generate SSE token (required for EventSource)
const { token, expiresAt } = await client.auth.generateSSEToken();
console.log(`Token expires at: ${expiresAt}`); // 5 minutes

// 2. Subscribe to all events
const eventSource = client.events.subscribe(token, {
onEvent: (event) => {
console.log('Event:', event.type);
const data = JSON.parse(event.data);
console.log('Data:', data);
},
onError: (error) => {
console.error('SSE error:', error);
}
});
// 3. Listen for specific event types
eventSource.addEventListener('job.completed', (event) => {
const data = JSON.parse(event.data);
console.log('Job completed!', data.summary);
eventSource.close(); // Clean up
});
eventSource.addEventListener('job.progress', (event) => {
const data = JSON.parse(event.data);
console.log(`Progress: ${data.progress}%`);
});

eventSource.addEventListener('deal.found', (event) => {
const data = JSON.parse(event.data);
console.log('Deal found!', data.title, data.score);
});
// 4. Subscribe to specific job only
const job = await client.scrape.create({ url: "https://example.com" });
const jobToken = await client.auth.generateSSEToken({ jobId: job.jobId });
const jobEvents = client.events.subscribeToJob(job.jobId, jobToken.token, {
onEvent: (event) => {
const data = JSON.parse(event.data);
console.log(`[${event.type}]`, data);
}
});

// 5. Check connection limits before subscribing
const limits = await client.auth.getLimits();
console.log(`Available SSE connections: ${limits.sse.available}/${limits.sse.maxConnections}`);
// Free: 2 concurrent, Pro: 10 concurrent, Enterprise: 50 concurrent

// 6. Helper: Wait for completion via SSE
const result = await client.events.waitForCompletion(job.jobId, (progress) => {
console.log(`Progress: ${progress}%`);
});
```
Available Event Types:
| Event Type | Description |
| ---------- | ----------- |
| job.created | Job was created |
| job.queued | Job entered queue |
| job.started | Worker picked up job |
| job.progress | Progress update (includes progress, stats, eta) |
| job.status | Status changed |
| job.completed | Job finished successfully |
| job.failed | Job failed (includes error details) |
| job.cancelled | Job was cancelled |
| job.log | Important log message |
| job.metric | Performance/business metric |
| job.alert | Important alert (quota warning, etc.) |
| job.checkpoint | Checkpoint saved (for resumable jobs) |
| deal.found | Deal detected during crawl |
| deal.validated | Deal scored/validated |
| ping | Keepalive (every 15 seconds) |
| connection.open | SSE connection established |
| connection.close | SSE connection closing |
| error | Error occurred |

TypeScript Support:
```typescript
import type {
SSEEvent,
JobProgressEvent,
JobCompletedEvent,
DealFoundEvent
} from "@dealcrawl/sdk";

// Type-safe event handling
eventSource.addEventListener('job.progress', (event: MessageEvent) => {
const data = JSON.parse(event.data) as JobProgressEvent['data'];
console.log(`Progress: ${data.progress}%`);
console.log(`ETA: ${data.eta?.remainingFormatted}`);
console.log(`Deals found: ${data.stats?.dealsFound}`);
});

eventSource.addEventListener('job.completed', (event: MessageEvent) => {
const data = JSON.parse(event.data) as JobCompletedEvent['data'];
console.log('Completed in:', data.durationMs, 'ms');
console.log('Summary:', data.summary);
});
```
Features:
- ✅ Automatic reconnection on disconnect
- ✅ Event replay via Last-Event-ID (up to 50 missed events)
- ✅ Keepalive pings every 15 seconds
- ✅ Max connection time: 1 hour (auto-reconnect after)
- ✅ Multi-tenant isolation (only see your events)
- ✅ Token-based auth (works with EventSource)

Security:
- Tokens expire after 5 minutes
- Tokens can be restricted to specific jobs
- Tokens stored in Redis (revocable)
- Connection limits per tier (Free: 2, Pro: 10, Enterprise: 50)
January 2026 Features in Detail
📸 Screenshot Storage (SEC-011)
Private by default with configurable signed URL expiration:
```typescript
// Basic screenshot (private with tier-specific TTL)
const job = await client.scrape.create({
url: "https://example.com",
screenshot: {
enabled: true,
fullPage: true,
format: "webp",
quality: 85,
signedUrlTtl: 604800, // 7 days (default for Pro/Enterprise)
},
});

const result = await client.waitForResult(job.jobId);
console.log(result.data.screenshotMetadata);
// {
// url: "https://...supabase.co/storage/v1/object/sign/screenshots-private/...",
// isPublic: false,
// expiresAt: "2026-01-25T12:00:00Z",
// width: 1280,
// height: 720,
// format: "webp",
// sizeBytes: 125000
// }
// Refresh signed URL before expiration
const refreshed = await client.screenshots.refresh({
path: "job_abc123/1234567890_nanoid_example.png",
ttl: 604800 // Extend for another 7 days
});
console.log(refreshed.url); // New signed URL
console.log(refreshed.expiresAt); // "2026-02-01T12:00:00Z"
// Get tier-specific TTL limits
const limits = await client.screenshots.getLimits();
console.log(limits);
// {
// tier: "pro",
// limits: { min: 3600, max: 604800, default: 604800 },
// formattedLimits: { min: "1 hour", max: "7 days", default: "7 days" }
// }
// Enterprise: Public URLs (opt-in)
const jobPublic = await client.scrape.create({
url: "https://example.com",
screenshot: {
enabled: true,
publicUrl: true, // ⚠️ Enterprise only - exposes data publicly
},
});
// → Public URL without expiration (Enterprise tier only)
```
Security Note: Screenshots are private by default to prevent exposure of personal data, copyrighted content, or sensitive tokens. Public URLs require Enterprise tier + explicit opt-in.
🎯 Priority Crawl System
3-tier queue system automatically prioritizes high-value pages:
```typescript
// Crawl with automatic prioritization
const job = await client.crawl.create({
url: "https://shop.example.com",
extractDeal: true,
minDealScore: 50, // Only extract deals scoring 50+
});

// Behind the scenes:
// - Pages scoring 70+ → High priority queue (5 workers, 30/min)
// - Pages scoring 40-69 → Medium priority queue (10 workers, 60/min)
// - Pages scoring <40 → Low priority queue (20 workers, 120/min)
```
🤖 AI Deal Extraction
Extract deals with LLM-powered analysis:
```typescript
// Extract deals during crawl
const job = await client.crawl.create({
url: "https://marketplace.example.com",
extractDeal: true,
minDealScore: 30, // Only extract if score >= 30
maxPages: 200,
});

// Get extracted deals
const deals = await client.status.getDeals(job.jobId, {
minScore: 70, // Filter for high-quality deals
limit: 50,
});
console.log(deals.deals); // Array of ExtractedDeal objects
```
Site-Specific Postprocessors (Amazon, Fnac, eBay)
Automatic enrichment for major e-commerce sites. When scraping Amazon, Fnac, or eBay, you get structured site-specific data:
```typescript
// Scrape an Amazon product
const result = await client.scrapeAndWait({
url: "https://www.amazon.fr/dp/B0EXAMPLE",
extractDeal: true,
});

// Site-specific data automatically extracted
console.log(result.data.siteData);
// {
// site: "amazon",
// productId: "B0EXAMPLE",
// seller: {
// name: "Official Store",
// rating: 4.8,
// reviewCount: 12543,
// isPrime: true
// },
// shipping: {
// free: true,
// prime: true,
// estimatedDelivery: "Tomorrow"
// },
// availability: {
// inStock: true,
// stockLevel: "in_stock",
// quantity: 50
// },
// condition: "new",
// buyBox: {
// seller: "Official Store",
// price: 29.99,
// isPrime: true
// },
// coupon: {
// discount: "10%",
// autoApplied: true
// },
// subscribeAndSave: {
// available: true,
// discountPercent: 15,
// price: 25.49
// },
// categories: ["Electronics", "Accessories", "Headphones"]
// }
// Check which postprocessors ran
console.log(result.data.postprocessorsUsed);
// ["amazon"]
```
Amazon-Specific Fields:
```typescript
// Amazon product with all enrichments
const result = await client.scrapeAndWait({
url: "https://www.amazon.com/dp/B0EXAMPLE",
});

if (result.data.siteData?.site === "amazon") {
const { siteData } = result.data;
// Prime eligibility
console.log("Prime:", siteData.shipping?.prime);
console.log("Free shipping:", siteData.shipping?.free);
// Buy Box winner
if (siteData.buyBox) {
console.log(`Best price: $${siteData.buyBox.price} from ${siteData.buyBox.seller}`);
}
// Subscribe & Save
if (siteData.subscribeAndSave?.available) {
console.log(`S&S: ${siteData.subscribeAndSave.discountPercent}% off`);
}
// Lightning deals
if (siteData.flashDeal?.active) {
console.log(`Flash deal ends: ${siteData.flashDeal.endsAt}`);
console.log(`${siteData.flashDeal.percentClaimed}% claimed`);
}
// Coupons
if (siteData.coupon) {
console.log(`Coupon: ${siteData.coupon.discount}`);
}
}
```
Fnac-Specific Fields:
```typescript
// Fnac product
const result = await client.scrapeAndWait({
url: "https://www.fnac.com/product/12345",
});

if (result.data.siteData?.site === "fnac") {
const { siteData } = result.data;
// Fnac Pro sellers
console.log("Pro Seller:", siteData.seller?.isProSeller);
// Express delivery
console.log("Express available:", siteData.shipping?.expressAvailable);
// Product condition
console.log("Condition:", siteData.condition);
// "new" | "refurbished" | "used" | "like_new" | "acceptable"
// Product variations
if (siteData.variations) {
siteData.variations.forEach(v => {
console.log(`${v.type}: ${v.options.join(", ")}`);
});
}
}
```
eBay-Specific Fields:
```typescript
// eBay listing
const result = await client.scrapeAndWait({
url: "https://www.ebay.com/itm/123456789",
});

if (result.data.siteData?.site === "ebay") {
const { siteData } = result.data;
// Top-rated seller badge
console.log("Top Rated:", siteData.seller?.topRated);
// Item condition
console.log("Condition:", siteData.condition);
// Stock level
if (siteData.availability) {
console.log("In stock:", siteData.availability.inStock);
console.log("Quantity:", siteData.availability.quantity);
}
}
```
Supported Domains:
| Site | Domains |
|--------|-------------------------------------------------------------------------------------------|
| Amazon | amazon.com, amazon.fr, amazon.de, amazon.co.uk, amazon.es, amazon.it, amazon.ca, +12 more |
| Fnac | fnac.com, fnac.be, fnac.ch, fnac.pt, fnac.es |
| eBay | ebay.com, ebay.fr, ebay.de, ebay.co.uk, ebay.es, ebay.it, ebay.ca, ebay.com.au, +6 more |
SiteSpecificData Type:
```typescript
import type { SiteSpecificData } from "@dealcrawl/sdk";

interface SiteSpecificData {
site: string; // "amazon" | "fnac" | "ebay"
productId?: string; // ASIN, SKU, or item ID
seller?: {
name: string;
url?: string;
rating?: number;
reviewCount?: number;
isPrime?: boolean; // Amazon
isProSeller?: boolean; // Fnac
topRated?: boolean; // eBay
};
shipping?: {
free: boolean;
estimatedDelivery?: string;
prime?: boolean; // Amazon Prime
expressAvailable?: boolean; // Fnac Express
};
availability?: {
inStock: boolean;
stockLevel?: "in_stock" | "low_stock" | "out_of_stock" | "preorder";
quantity?: number;
message?: string;
};
condition?: "new" | "refurbished" | "used" | "like_new" | "acceptable";
buyBox?: { // Amazon only
seller: string;
price: number;
isPrime: boolean;
};
coupon?: {
code?: string;
discount: string;
autoApplied: boolean;
};
flashDeal?: { // Lightning deals
active: boolean;
endsAt?: string;
percentClaimed?: number;
};
subscribeAndSave?: { // Amazon only
available: boolean;
discountPercent?: number;
price?: number;
};
categories?: string[];
variations?: Array<{
type: string;
options: string[];
selected?: string;
}>;
rawData?: Record<string, unknown>;
}
```
📝 Markdown Output
Convert HTML to clean, structured markdown:
```typescript
// Single page markdown
const job = await client.scrape.create({
url: "https://blog.example.com/article",
outputMarkdown: true,
markdownBaseUrl: "https://blog.example.com", // Resolve relative URLs
onlyMainContent: true,
});

const result = await client.waitForResult(job.jobId);
console.log(result.data.parsed.markdown);
// Clean markdown with:
// - GFM tables, strikethrough, task lists
// - Code blocks with syntax detection
// - Absolute URLs
// - Noise removal (ads, navigation)
```
🎬 Browser Actions
Execute actions before scraping for dynamic content:
```typescript
// Handle cookie popups and load more content
const job = await client.scrape.create({
url: "https://shop.example.com/products",
actions: [
{ type: "click", selector: "#accept-cookies", optional: true },
{ type: "wait", milliseconds: 500 },
{ type: "scroll", direction: "down", amount: 500 },
{ type: "click", selector: ".load-more", retries: 3 },
{ type: "wait", selector: ".products-loaded" },
],
extractMultipleDeals: true,
});

// Search and extract
const job2 = await client.scrape.create({
url: "https://marketplace.com",
actions: [
{ type: "write", selector: "input[name='search']", text: "laptop deals" },
{ type: "press", key: "Enter" },
{ type: "wait", selector: ".results" },
],
extractMultipleDeals: true,
maxDeals: 30,
});
```
Configuration
```typescript
const client = new DealCrawl({
apiKey: "sk_xxx", // Required
baseUrl: "https://api.dealcrawl.dev", // Optional (default)
timeout: 30000, // Request timeout in ms
maxRetries: 3, // Retry attempts
retryDelay: 1000, // Base retry delay in ms
onRateLimit: (info) => console.log(info), // Rate limit callback
});
```
Resources
Scrape Resource (client.scrape.*)
```typescript
// Basic scrape
const job = await client.scrape.create({
url: "https://example.com",
detectSignals: true,
});

// With deal extraction
const job = await client.scrape.extractDeal("https://shop.example.com/sale");
// With screenshot
const job = await client.scrape.withScreenshot("https://example.com", {
format: "webp",
fullPage: true,
});
```
Options:
| Option | Type | Default | Description |
| ---------------------- | -------- | -------- | --------------------------------------------------------- |
| url | string | required | URL to scrape |
| noStore | boolean | false | Zero Data Retention - don't save results (Pro/Enterprise) |
| detectSignals | boolean | true | Detect prices, discounts, urgency |
| extractDeal | boolean | false | Extract deal information |
| extractMultipleDeals | boolean | false | Extract multiple deals from list pages |
| maxDeals | number | 20 | Max deals to extract (max: 50) |
| extractWithAI | boolean | false | Use AI for extraction |
| useAdvancedModel | boolean | false | Use GPT-4o (higher cost) |
| minDealScore | number | 0 | Minimum deal score (0-100) |
| screenshot | object | - | Screenshot options |
| excludeTags | string[] | - | HTML tags to exclude |
| excludeSelectors | string[] | - | CSS selectors to exclude |
| onlyMainContent | boolean | true | Extract main content only |
| headers | object | - | Custom HTTP headers |
| timeout | number | 30000 | Request timeout in ms (max: 120000) |
| outputMarkdown | boolean | false | Convert content to Markdown (GFM) |
| markdownBaseUrl | string | - | Base URL for resolving relative URLs in markdown |
| actions | array | - | Browser actions to execute before scraping |
| runPostprocessors | boolean | true | Run site-specific postprocessors (Amazon, Fnac, eBay) |
| forceDynamic | boolean | false | Force browser-based scraping (uses 'render' quota) |
| forceStealth | boolean | false | Force stealth mode for anti-bot sites |
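For instance, several of these options combined in one request; every field below appears in the table above:
```typescript
// Combine content filtering, markdown output, and multi-deal
// extraction in a single scrape job.
const job = await client.scrape.create({
  url: "https://shop.example.com/sale",
  onlyMainContent: true,
  excludeSelectors: [".cookie-banner", ".newsletter-popup"], // strip noise
  outputMarkdown: true,          // return clean GFM markdown
  extractMultipleDeals: true,    // list page with many offers
  maxDeals: 30,
  minDealScore: 40,              // skip weak deals
  timeout: 60000,
});
```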
Batch Scrape (client.scrape.batch())
```typescript
// Scrape multiple URLs in one request (1-100 URLs)
const batch = await client.scrape.batch({
urls: [
{ url: "https://shop1.com/product1" },
{ url: "https://shop2.com/deal", extractDeal: true },
{ url: "https://shop3.com/sale", screenshot: { enabled: true } },
],
defaults: {
detectSignals: true,
timeout: 30000,
},
delayMs: 500, // ✨ Was: delay
ignoreInvalidURLs: true, // ✨ NEW: Skip invalid URLs instead of failing
});

// Get batch status
const status = await client.scrape.getBatchStatus(batch.batchId);
// Wait for all batch jobs
const results = await client.waitForAll(batch.jobIds);
```
Batch Options:
| Option | Type | Default | Description |
| ------------------ | ------- | -------- | ---------------------------------------------------- |
| urls | array | required | 1-100 URL objects with optional overrides |
| defaults | object | - | Default options applied to all URLs |
| priority | number | 5 | Priority 1-10 (higher = faster) |
| delayMs | number | 0 | Delay between URLs (0-5000ms) |
| webhookUrl | string | - | Webhook for batch completion |
| ignoreInvalidURLs | boolean | false | Continue on invalid URLs (Firecrawl-compatible) |

Search Resource (client.search.*)
```typescript
// Basic search
const job = await client.search.create({
query: "laptop deals black friday",
limit: 20, // ✨ Was: maxResults
});

// AI-optimized search with deal scoring
const job = await client.search.create({
query: "iPhone discount",
useAiOptimization: true,
aiProvider: "openai",
aiModel: "gpt-4o-mini",
useDealScoring: true,
});
// Search with auto-scraping of results
const job = await client.search.create({
query: "promo codes electronics",
scrapeResults: true, // ✨ Was: autoScrape
maxScrapeResults: 5, // ✨ Was: autoScrapeLimit
});
// Filtered search
const job = await client.search.create({
query: "software deals",
filters: {
location: "fr",
language: "fr",
dateRange: "month",
domain: "amazon.fr", // Single domain filter
},
});
// Check search API status
const status = await client.search.getStatus();
// Convenience: search and wait
const result = await client.searchAndWait({
query: "gaming laptop deals",
useDealScoring: true,
});
```
Search Options:
| Option | Type | Default | Description |
| ------------------- | ------- | -------- | ----------------------------------------------- |
| query | string | required | Search query |
| limit | number | 10 | Results to return (1-100) |
| useAiOptimization | boolean | false | AI-enhance the query |
| aiProvider | string | "openai" | "openai" or "anthropic" |
| aiModel | string | - | Model ID (gpt-4o-mini, claude-3-5-sonnet, etc.) |
| useDealScoring | boolean | false | Score results for deal relevance |
| scrapeResults | boolean | false | Auto-scrape top results |
| maxScrapeResults | number | 5 | Number of results to scrape (1-10) |
| filters | object | - | Location, language, date, domain |

Crawl Resource (client.crawl.*)
```typescript
// Basic crawl
const job = await client.crawl.create({
url: "https://shop.example.com",
maxDepth: 3,
maxPages: 100,
extractDeal: true,
});

// Using templates
const job = await client.crawl.withTemplate("ecommerce", {
url: "https://shop.example.com",
});
// Analyze before crawling
const analysis = await client.crawl.analyze("https://shop.example.com");
console.log(analysis.recommendedTemplate);
console.log(analysis.estimatedPages);
// Find deals (convenience method)
const job = await client.crawl.forDeals("https://shop.example.com", {
minDealScore: 70,
});
// Advanced crawl with filtering
const job = await client.crawl.create({
url: "https://marketplace.example.com",
maxDepth: 4,
maxPages: 500,
extractDeal: true,
minDealScore: 50,
categories: ["software", "courses"],
priceRange: { min: 0, max: 100 },
onlyHighQuality: true,
webhookUrl: "https://my-server.com/crawl-updates",
syncToDealup: true,
});
// Enterprise: priority queue override
const job = await client.crawl.create({
url: "https://time-sensitive-deals.com",
priority: "high", // Enterprise only
onlyHighQuality: true,
});
```
Available Templates:
- ecommerce - Product pages and online stores
- marketplace - Multi-vendor marketplaces
- blog - Blog posts and articles
- docs - Documentation sites
- custom - No preset, use your own settings

Crawl Options:
| Option | Type | Default | Description |
| ------------------ | -------- | -------- | ---------------------------------------------------- |
| url | string | required | Starting URL |
| maxDepth | number | 3 | Max crawl depth (1-5) |
| maxPages | number | 100 | Max pages to crawl (1-1000) |
| detectSignals | boolean | true | Detect prices, discounts |
| extractDeal | boolean | false | Extract deal info with AI |
| minDealScore | number | 30 | Min deal score threshold (0-100) |
| categories | array | - | Filter: courses, software, physical, services, other |
| priceRange | object | - | Filter: { min, max } price |
| onlyHighQuality | boolean | false | Only deals scoring 70+ |
| allowedMerchants | string[] | - | Only these merchants |
| blockedMerchants | string[] | - | Exclude these merchants |
| webhookUrl | string | - | Real-time notifications URL |
| syncToDealup | boolean | false | Auto-sync to DealUp |
| template | string | - | Job template to use |
| useSmartRouting | boolean | true | Auto-detect best settings |
| priority | string | - | Queue priority (Enterprise only) |
| requireJS | boolean | false | Force JavaScript rendering |
| bypassAntiBot | boolean | false | Advanced anti-bot techniques |
| outputMarkdown | boolean | false | Convert pages to Markdown (GFM) |
| markdownBaseUrl | string | - | Base URL for relative links in markdown |
| noStore | boolean | false | Zero Data Retention (Pro/Enterprise only) |

Extract Resource (client.extract.*)
```typescript
// Schema-based extraction
const job = await client.extract.withSchema("https://example.com/product", {
type: "object",
properties: {
name: { type: "string" },
price: { type: "number" },
features: { type: "array", items: { type: "string" } },
},
});

// Prompt-based extraction
const job = await client.extract.withPrompt(
"https://example.com/article",
"Extract the article title, author, and main points"
);
// Pre-built extractors
const job = await client.extract.product("https://shop.example.com/item");
const job = await client.extract.article("https://blog.example.com/post");
const job = await client.extract.contact("https://example.com/contact");
```
Dork Resource (client.dork.*)
```typescript
// Basic dork search
const job = await client.dork.create({
query: "discount coupon",
site: "amazon.com",
maxResults: 50,
});

// Find deals on a site
const job = await client.dork.findDeals("amazon.com");
// Find products
const job = await client.dork.findProducts("shop.example.com");
// Find PDFs
const job = await client.dork.findPDFs("docs.example.com", "user guide");
// Build query string (for preview)
const query = client.dork.buildQuery({
query: "laptop deals",
site: "amazon.com",
inTitle: "discount",
});
// Returns: "laptop deals site:amazon.com intitle:discount"
```
Agent Resource (client.agent.*)
Create AI agents that can navigate websites, interact with elements, and extract structured data using natural language instructions.
```typescript
// Basic agent - navigate and extract data
const job = await client.agent.create({
url: "https://amazon.com",
prompt:
"Search for wireless headphones under $50 and extract the top 5 results",
schema: {
type: "object",
properties: {
products: {
type: "array",
items: {
type: "object",
properties: {
name: { type: "string" },
price: { type: "number" },
rating: { type: "number" },
},
},
},
},
},
maxSteps: 15,
});

// Wait for result
const result = await client.agentAndWait({
url: "https://booking.com",
prompt: "Find hotels in Paris for 2 adults, March 15-17",
takeScreenshots: true,
});
// Generate schema from natural language (helper)
const schemaResult = await client.agent.generateSchema({
prompt: "Find student deals on marketing courses with price and discount",
});
// Returns: { schema, refinedPrompt, confidence, suggestedQuestions? }
// Use generated schema
const job = await client.agent.create({
url: "https://coursera.org",
prompt: schemaResult.refinedPrompt,
schema: schemaResult.schema,
});
// Preset actions (handle popups, cookies, etc.)
const job = await client.agent.withPresetActions(
"https://shop.com",
"Find the best discounts",
[
{ type: "click", selector: "#accept-cookies" },
{ type: "wait", milliseconds: 1000 },
]
);
// Deal-focused agent with pre-built schema
const job = await client.agent.forDeals(
"https://slickdeals.net",
"Find the top 10 tech deals posted today"
);
// Use Claude instead of GPT
const job = await client.agent.withClaude(
"https://complex-site.com",
"Navigate the checkout process"
);
```
Agent Options:
| Option | Type | Default | Description |
| ----------------- | ------- | -------- | --------------------------------------------- |
| url | string | required | Starting URL |
| prompt | string | required | Natural language instructions (10-2000 chars) |
| schema | object | - | JSON Schema for structured output |
| maxSteps | number | 10 | Maximum navigation steps (max: 25) |
| actions | array | - | Preset actions to execute first |
| model | string | "openai" | LLM provider: "openai" or "anthropic" |
| timeout | number | 30000 | Per-step timeout in ms (max: 60000) |
| takeScreenshots | boolean | false | Capture screenshot at each step |
| onlyMainContent | boolean | true | Extract main content only |
| enableVision | boolean | tier | Enable vision-based navigation (screenshots for LLM) |
| visionOptions | object | - | Vision capture configuration (see below) |

Vision Options:
| Option | Type | Default | Description |
| ------------------- | ------ | ------- | ------------------------------------------------ |
| quality | number | tier | JPEG quality for screenshots (1-100) |
| annotateElements | boolean| false | Label clickable elements on screenshots |
| captureFrequency | string | tier | When to capture: every_step, on_demand, on_stuck |

Action Types:
| Action | Key Parameters | Description |
|--------------|---------------------------------------------------|--------------------------|
| click | selector, waitAfter?, button?, force? | Click an element |
| scroll | direction, amount?, smooth? | Scroll page/to element |
| write | selector, text, clearFirst?, typeDelay? | Type text into input |
| wait | milliseconds?, selector?, condition? | Wait for time or element |
| press | key, modifiers? | Press keyboard key |
| screenshot | fullPage?, selector?, name? | Capture screenshot |
| hover | selector, duration? | Hover over element |
| select | selector, value, byLabel? | Select dropdown option |

Action Resilience (all actions support):
- optional: boolean - Don't fail job if action fails
- retries: number - Retry failed action (1-5 times)
- delayBefore: number - Delay before executing action (ms)
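Combining the three resilience fields on concrete actions; everything in this sketch comes from the action table and the list above:
```typescript
// Resilient action sequence: an optional cookie banner, a retried
// "load more" click, and a short settle delay before scrolling.
const job = await client.scrape.create({
  url: "https://shop.example.com/products",
  actions: [
    { type: "click", selector: "#accept-cookies", optional: true }, // skip silently if absent
    { type: "click", selector: ".load-more", retries: 3 },          // retry up to 3 times
    { type: "scroll", direction: "down", amount: 800, delayBefore: 500 }, // wait 500ms first
  ],
});
```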
Schema Generation:
```typescript
// Generate JSON Schema from natural language
const schemaResult = await client.agent.generateSchema({
prompt: "Find e-commerce product deals with prices and discounts",
context: {
domains: ["e-commerce", "retail"], // Help AI understand context
dataTypes: ["prices", "discounts"], // Expected data types
format: "json", // Output format
clarifications: ["Include shipping info"] // Additional requirements
},
});

// Use the generated schema
const job = await client.agent.create({
url: "https://shop.example.com",
prompt: schemaResult.refinedPrompt, // AI-improved prompt
schema: schemaResult.schema, // Generated JSON Schema
});
// Check confidence - if low, ask clarifying questions
if (schemaResult.confidence < 0.7) {
console.log("Consider clarifying:", schemaResult.suggestedQuestions);
}
```
Status Resource (client.status.*)
```typescript
// Get job status
const status = await client.status.get(jobId);

// Get deals from a job
const deals = await client.status.getDeals(jobId, {
minScore: 70,
limit: 20,
});
// Resume a failed/paused job
const resumed = await client.status.resume(jobId);
// Get job metrics
const metrics = await client.status.getMetrics(jobId);
// Cancel a job
await client.status.cancel(jobId);
// Convenience methods
const isComplete = await client.status.isComplete(jobId);
const succeeded = await client.status.succeeded(jobId);
const result = await client.status.getResult(jobId);
```
Data Resource (client.data.*)
```typescript
// List jobs with filtering
const jobs = await client.data.listJobs({
status: "completed",
type: "crawl",
page: 1,
limit: 20,
sortBy: "created_at",
sortOrder: "desc",
});

// List deals with filtering
const deals = await client.data.listDeals({
minScore: 70,
category: "electronics",
sortBy: "deal_score",
});
// Get top deals
const topDeals = await client.data.getTopDeals(20, 80);
// Export data
const jsonExport = await client.data.exportDeals({ format: "json" });
const csvExport = await client.data.exportDeals({ format: "csv" });
// Get statistics
const stats = await client.data.getStats();
```
Webhooks Resource (client.webhooks.*)
```typescript
// Create a webhook
const webhook = await client.webhooks.create({
event: "deal.found",
url: "https://my-server.com/webhooks/deals",
secret: "my-webhook-secret",
minDealScore: 70,
});

// List webhooks
const webhooks = await client.webhooks.list();
// Test a webhook
const result = await client.webhooks.test(webhookId);
// Enable/disable
await client.webhooks.enable(webhookId);
await client.webhooks.disable(webhookId);
// Delete
await client.webhooks.delete(webhookId);
```
Events:
- deal.found - New deal discovered
- deal.synced - Deal synced to DealUp
- crawl.completed - Crawl job finished
- crawl.failed - Crawl job failed

Screenshots Resource (client.screenshots.*)
Manage screenshot signed URLs with configurable TTL and automatic refresh:
```typescript
// Refresh a signed URL before expiration
const refreshed = await client.screenshots.refresh({
path: "job_abc123/1234567890_nanoid_example.png",
ttl: 604800 // Optional: 7 days (defaults to tier default)
});
console.log(refreshed.url); // New signed URL
console.log(refreshed.expiresAt); // "2026-01-25T12:00:00Z"
console.log(refreshed.tierLimits); // { min: 3600, max: 604800, default: 604800 }

// Get tier-specific TTL limits
const limits = await client.screenshots.getLimits();
console.log(limits.tier); // "pro"
console.log(limits.limits); // { min: 3600, max: 604800, default: 604800 }
console.log(limits.formattedLimits); // { min: "1 hour", max: "7 days", default: "7 days" }
// Specify custom bucket (defaults to 'screenshots-private')
const refreshed = await client.screenshots.refresh({
path: "job_xyz/screenshot.png",
ttl: 86400, // 1 day
bucket: "screenshots-private"
});
```
TTL Limits by Tier:
| Tier | Min TTL | Max TTL | Default TTL |
|------------|---------|---------|-------------|
| Free | 1 hour | 24 hours| 24 hours |
| Pro | 1 hour | 7 days | 7 days |
| Enterprise | 1 hour | 30 days | 7 days |
Security Note: All screenshots are private by default. Public URLs (Enterprise only) don't require refresh as they don't expire.
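As a convenience, here is a small helper sketch built only on refresh() and the expiresAt field shown above: reuse the signed URL while it has comfortable time left, refresh it otherwise.
```typescript
// Reuse a signed URL while it has over an hour left; otherwise refresh.
// Uses only client.screenshots.refresh() and the expiresAt field above.
async function freshScreenshotUrl(current: { url: string; path: string; expiresAt: string }) {
  const msLeft = new Date(current.expiresAt).getTime() - Date.now();
  if (msLeft > 60 * 60 * 1000) return current.url; // plenty of time left
  const refreshed = await client.screenshots.refresh({ path: current.path });
  return refreshed.url; // new signed URL with the tier-default TTL
}
```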
Keys Resource (client.keys.*)
```typescript
// List all keys
const keys = await client.keys.list();

// Create a new key with proper scopes
const newKey = await client.keys.create({
name: "Production Key",
scopes: ["scrape", "crawl", "status", "data:read"],
expiresInDays: 365,
});
// ⚠️ Save newKey.key immediately - it won't be shown again!
// Rotate a key
const rotated = await client.keys.rotate(keyId, {
newName: "Production Key v2",
});
// Revoke a key with reason
await client.keys.revoke(keyId, {
reason: "Key compromised - rotating for security",
});
// Get key stats
const stats = await client.keys.getStats(keyId, { days: 30 });
```
Available Scopes:
| Scope | Endpoint | Description |
| ----------------- | --------------------------------- | ------------------------- |
| scrape | POST /v1/scrape, /v1/scrape/batch | Create scrape jobs |
| crawl | POST /v1/crawl | Create crawl jobs |
| dork | POST /v1/dork | Create dork searches |
| extract | POST /v1/extract | Create extraction jobs |
| agent | POST /v1/agent | Create AI agent jobs |
| status | GET /v1/status/:id | Read job status |
| data:read | GET /v1/data/* | Read jobs/deals |
| data:export | GET /v1/data/export | Export data |
| keys:manage | /v1/keys | Manage API keys |
| webhooks:manage | /v1/webhooks | Manage webhooks |

Scope Examples:
```typescript
// Read-only monitoring key
await client.keys.create({
name: "Monitoring Dashboard",
scopes: ["status", "data:read"],
});

// Production scraping key
await client.keys.create({
name: "Scraper Service",
scopes: ["scrape", "status", "data:read"],
});
// Full access (admin)
await client.keys.create({
name: "Admin Key",
scopes: [
"scrape",
"crawl",
"dork",
"extract",
"agent",
"status",
"data:read",
"data:export",
"keys:manage",
"webhooks:manage",
],
});
```
Postprocessors Resource (client.postprocessors.*)
Discover which postprocessors are available and check URL applicability:
```typescript
// List all available postprocessors
const list = await client.postprocessors.list();
console.log(list.postprocessors);
// [
// {
// name: "amazon",
// domains: ["amazon.com", "amazon.fr", "amazon.de", ...],
// extractedFields: ["productId", "seller", "shipping", "availability", ...],
// description: "Amazon product page enrichment..."
// },
// ...
// ]
console.log(list.totalDomains); // 30+ supported domains

// Check if a specific URL will be enriched
const amazonCheck = await client.postprocessors.check({
url: "https://www.amazon.fr/dp/B09V3KXJPB"
});
console.log(amazonCheck.hasPostprocessor); // true
console.log(amazonCheck.postprocessor); // "amazon"
console.log(amazonCheck.extractedFields); // ["productId", "seller", ...]
// Check unsupported URL
const genericCheck = await client.postprocessors.check({
url: "https://example.com/product"
});
console.log(genericCheck.hasPostprocessor); // false
console.log(genericCheck.message); // "No postprocessor available..."
```
Disable postprocessors for faster scraping:
```typescript
const job = await client.scrape.create({
url: "https://amazon.com/dp/B123",
runPostprocessors: false, // Skip enrichment for speed
});
```
Usage Resource (client.usage.*)
Monitor your usage, quotas, and LLM token consumption:
```typescript
// Get current usage and quotas
const usage = await client.usage.current();
console.log(usage.tier); // "pro"
console.log(usage.usage.scrapes); // 150
console.log(usage.quotas.scrapes); // 10000
console.log(usage.percentUsed.scrapes); // 1.5
console.log(usage.billingPeriod.daysRemaining); // 15

// Get historical usage (for billing analysis)
const history = await client.usage.history({ months: 6 });
history.data.forEach(period => {
console.log(`${period.periodStart}: ${period.usage.scrapes} scrapes`);
});

// Get LLM token usage summary
const tokens = await client.usage.tokens({ days: 30 });
console.log(tokens.totals.totalTokens); // 1234567
console.log(tokens.totals.estimatedCostUsd); // 12.34
tokens.breakdown.forEach(item => {
console.log(`${item.provider}/${item.model}: ${item.totalTokens} tokens`);
});

// Filter by provider
const openaiTokens = await client.usage.tokens({
days: 30,
provider: "openai"
});
// Get daily token breakdown (for charts)
const daily = await client.usage.dailyTokens({ days: 7 });
daily.daily.forEach(day => {
console.log(`${day.date}: ${day.totalTokens} tokens ($${day.estimatedCostUsd})`);
});
```
Account Resource (client.account.*)
```typescript
// Get account info
const account = await client.account.get();
console.log(account.tier); // "free" | "pro" | "enterprise"
console.log(account.usage);

// Get metrics
const metrics = await client.account.getMetrics();
// Get recommendations
const recommendations = await client.account.getRecommendations();
// Update preferences
await client.account.updatePreferences({
minDealScore: 70,
autoSync: true,
preferredCategories: ["software", "courses"],
});
// Convenience methods
const remaining = await client.account.getRemainingQuota("scrapes");
const hasQuota = await client.account.hasQuota("crawls", 5);
const isPremium = await client.account.isPremium();
```
Polling & Waiting
```typescript
// Wait for a single job
const result = await client.waitForResult(jobId, {
pollInterval: 2000, // Check every 2 seconds
timeout: 300000, // 5 minute timeout
onProgress: (status) => console.log(`Progress: ${status.progress}%`),
onStatusChange: (newStatus, oldStatus) => {
console.log(`Status changed: ${oldStatus} → ${newStatus}`);
},
});

// Wait for multiple jobs
const results = await client.waitForAll([jobId1, jobId2, jobId3]);
// Wait for any job to complete
const firstResult = await client.waitForAny([jobId1, jobId2, jobId3]);
// Convenience: scrape and wait
const result = await client.scrapeAndWait({
url: "https://example.com",
extractDeal: true,
});
// Convenience: crawl and wait
const result = await client.crawlAndWait({
url: "https://shop.example.com",
maxPages: 50,
});
```
Field Selection
Reduce response payload size by selecting only the fields you need:
```typescript
// Select specific fields from job status
const status = await client.status.get(jobId, {
fields: ["id", "status", "progress", "result.title", "result.url"],
});

// Select fields from deals list
const deals = await client.data.listDeals({
minScore: 70,
fields: ["id", "title", "price", "discount", "dealScore"],
});
// Nested field selection
const jobs = await client.data.listJobs({
fields: ["id", "status", "result.deals.title", "result.deals.price"],
});
// Agent job field selection
const agentStatus = await client.status.get(agentJobId, {
fields: [
"id",
"status",
"data.extractedData", // Final extracted data
"data.steps.action", // Just action details (skip observations)
"data.totalSteps",
],
});
// Markdown content selection
const scrapeResult = await client.status.get(scrapeJobId, {
fields: ["id", "status", "result.parsed.markdown", "result.parsed.title"],
});
```
Benefits:
- 85-90% payload reduction for large responses
- Faster API responses
- Lower bandwidth usage
Supported Endpoints:
- GET /v1/status/:jobId
- GET /v1/status/:jobId/deals
- GET /v1/status/:jobId/metrics
- GET /v1/data/jobs
- GET /v1/data/deals
- GET /v1/data/:jobId

Error Handling
```typescript
import { DealCrawl, DealCrawlError, ERROR_CODES } from "@dealcrawl/sdk";

try {
const result = await client.scrape.create({ url: "..." });
} catch (error) {
if (error instanceof DealCrawlError) {
console.log(error.code); // e.g., "RATE_LIMIT_EXCEEDED"
console.log(error.statusCode); // HTTP status code
console.log(error.message); // Human-readable message
console.log(error.details); // Additional details
// Check error type
if (error.isRateLimited()) {
console.log(`Retry after ${error.retryAfter} seconds`);
}
if (error.isAuthError()) {
console.log("Check your API key");
}
if (error.isRetryable()) {
// Automatic retry was attempted
}
}
}
```
Error Codes:
- INVALID_API_KEY - API key is invalid or missing
- RATE_LIMIT_EXCEEDED - Too many requests
- QUOTA_EXCEEDED - Monthly quota exceeded
- JOB_NOT_FOUND - Job ID doesn't exist
- JOB_TIMEOUT - Job didn't complete in time
- FETCH_FAILED - Network request failed
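Because RATE_LIMIT_EXCEEDED exposes error.retryAfter (in seconds, as shown above), a manual backoff layer on top of the SDK's built-in retries can be sketched like this:
```typescript
// Sketch: wait out a persistent rate limit using error.retryAfter,
// then retry once. The SDK already retries transient errors itself.
async function scrapeWithBackoff(url: string) {
  try {
    return await client.scrape.create({ url });
  } catch (error) {
    if (error instanceof DealCrawlError && error.isRateLimited()) {
      await new Promise((r) => setTimeout(r, (error.retryAfter ?? 1) * 1000));
      return client.scrape.create({ url }); // single manual retry
    }
    throw error;
  }
}
```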
TypeScript Types
All types are exported for your convenience:
```typescript
import type {
// Configuration
DealCrawlConfig,
// Request Options
ScrapeOptions,
BatchScrapeOptions,
CrawlOptions,
CrawlPriority,
CrawlCategory,
PriceRange,
SearchOptions,
ExtractOptions,
DorkOptions,
AgentOptions,
VisionOptions, // NEW: Vision configuration
VisionCaptureFrequency, // NEW: "every_step" | "on_demand" | "on_stuck"
SchemaGenerationOptions,
// Responses
JobStatusResponse,
ListDealsResponse,
DealItem,
AgentJobResponse,
AgentStatusResponse,
AgentResultResponse,
AgentMemoryStats, // NEW: Memory stats from agent sessions
SchemaGenerationResponse,
SearchJobResponse,
BatchScrapeResponse,
// Postprocessor Types (NEW)
PostprocessorInfo,
PostprocessorsListResponse,
PostprocessorCheckResponse,
// Usage Types (NEW)
CurrentUsageResponse,
UsageHistoryResponse,
TokenUsageResponse,
DailyTokenUsageResponse,
BillingPeriod,
QuotaLimits,
UsagePercentages,
// Action Types
ActionInput,
ClickAction,
ScrollAction,
WriteAction,
WaitAction,
PressAction,
HoverAction,
SelectAction,
// Screenshot Options & Responses
ScreenshotOptions,
ScreenshotResult,
RefreshScreenshotOptions,
ScreenshotRefreshResponse,
ScreenshotLimitsResponse,
// Re-exports from @dealcrawl/shared
ScrapeResult,
CrawlResult,
ExtractedDeal,
Signal,
ParsedPage, // Includes markdown field
SiteSpecificData, // Postprocessor enrichment data
} from "@dealcrawl/sdk";
```
Examples
Find the Best Deals on a Site
```typescript
async function findBestDeals(siteUrl: string) {
const client = new DealCrawl({ apiKey: process.env.DEALCRAWL_API_KEY! });

// Crawl the site for deals
const job = await client.crawl.forDeals(siteUrl, {
maxPages: 200,
minDealScore: 60,
});
// Wait for completion
const result = await client.waitForResult(job.jobId, {
timeout: 600000, // 10 minutes
onProgress: (s) => console.log(`Crawled: ${s.progress}%`),
});

if (result.status === "failed") {
throw new Error(result.error);
}
// Get the best deals
const deals = await client.status.getDeals(job.jobId, {
minScore: 80,
limit: 50,
});
return deals.deals.sort((a, b) => b.dealScore - a.dealScore);
}
```
Extract Structured Product Data
```typescript
async function extractProduct(productUrl: string) {
const client = new DealCrawl({ apiKey: process.env.DEALCRAWL_API_KEY! });

const result = await client.extractAndWait({
url: productUrl,
schema: {
type: "object",
properties: {
name: { type: "string" },
price: { type: "number" },
originalPrice: { type: "number" },
discount: { type: "string" },
rating: { type: "number" },
reviews: { type: "number" },
availability: { type: "string" },
features: { type: "array", items: { type: "string" } },
},
required: ["name", "price"],
},
model: "gpt-4o-mini",
});
return result.result;
}
```
Set Up Deal Monitoring
```typescript
async function setupDealMonitoring(webhookUrl: string) {
const client = new DealCrawl({ apiKey: process.env.DEALCRAWL_API_KEY! });

// Create webhook for high-score deals
await client.webhooks.create({
event: "deal.found",
url: webhookUrl,
minDealScore: 85,
categories: ["software", "courses"],
});
// Set up preferences
await client.account.updatePreferences({
minDealScore: 70,
webhookEnabled: true,
});
console.log("Deal monitoring configured!");
}
```
Browser Usage
The SDK works in browsers with native fetch support:
```typescript
// In a browser environment
import { DealCrawl } from "@dealcrawl/sdk";

const client = new DealCrawl({
apiKey: "your-api-key", // ⚠️ Don't expose keys in client-side code!
});
```
> Warning: Never expose your API key in client-side code. Use a backend proxy or edge function.
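One minimal shape for such a proxy, as a sketch: the route handler and its request shape are illustrative and not part of the SDK.
```typescript
// Illustrative backend proxy: the browser calls your endpoint and the
// API key stays on the server. The handler name and request shape are
// examples, not part of the SDK.
import { DealCrawl } from "@dealcrawl/sdk";

const server = new DealCrawl({ apiKey: process.env.DEALCRAWL_API_KEY! });

export async function handleScrapeRequest(body: { url: string }) {
  const job = await server.scrape.create({ url: body.url });
  return { jobId: job.jobId }; // the browser polls your backend, not DealCrawl
}
```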
Migration Guide (v2.10.x → v2.11.0)
SearchOptions
```diff
const result = await client.search.create({
query: "laptop deals",
- maxResults: 20,
+ limit: 20,
- autoScrape: true,
+ scrapeResults: true,
- autoScrapeLimit: 5,
+ maxScrapeResults: 5,
});
```
BatchScrapeOptions
```diff
const batch = await client.scrape.batch({
urls: [...],
- delay: 500,
+ delayMs: 500,
+ ignoreInvalidURLs: true, // NEW: Firecrawl-compatible
});
```
ExtractModel
```diff
const job = await client.extract.create({
url: "...",
- model: "claude-3-haiku",
+ model: "claude-3-5-haiku-20241022",
});
```
ApiKeyScope
```diff
await client.keys.create({
name: "My Key",
scopes: [
"scrape",
- "scrape:batch", // REMOVED - use "scrape" instead
- "search", // REMOVED - use "scrape" instead
"crawl",
"status",
],
});
```
Compatibility
- Node.js: 18.0+
- Bun: All versions
- Browser: Modern browsers with fetch support

By @Shipfastgo
MIT © DealUp