A plugin for playwright & puppeteer to route proxies dynamically.
npm install @extra/proxy-router> A plugin for [playwright-extra] and [puppeteer-extra] to route proxies dynamically.
``bash`
yarn add @extra/proxy-router- or -
npm install @extra/proxy-router
Playwright
If this is your first [playwright-extra] plugin here's everything you need:
`bash`
yarn add playwright playwright-extra @extra/proxy-router- or -
npm install playwright playwright-extra @extra/proxy-router
Puppeteer
If this is your first [puppeteer-extra] plugin here's everything you need:
`bash`
yarn add puppeteer puppeteer-extra @extra/proxy-router- or -
npm install puppeteer puppeteer-extra @extra/proxy-router
| 💫 | 
Chromium | 
Chrome | 
Firefox | 
Webkit |
| :--------------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| [playwright-extra] | ✅ | ✅ | ✅ | ✅ |
| [puppeteer-extra] | ✅ | ✅ | 🕒 | - |
| Headless | Headful | Launch | Connect |
| :------: | :-----: | :----: | :------------------------------: |
| ✅ | ✅ | ✅ | ✅ (local) |
The plugin makes using proxies in the browser a lot more convenient:
- Handles proxy authentication
- Multiple proxies can be used
- Flexible proxy routing using the host/domain
- Change proxies dynamically after browser launch
- Collect traffic stats per proxy or host
- Uses native browser features, no performance loss
> Using puppeteer? To use the following playwright examples simply change your imports
A single proxy for all browser connections
`js
// playwright-extra is a drop-in replacement for playwright,
// it augments the installed playwright with plugin functionality
// Note: Instead of chromium you can use firefox and webkit as well.
const { chromium } = require('playwright-extra')
// Configure and add the proxy router plugin with a default proxy
const ProxyRouter = require('@extra/proxy-router')
chromium.use(
ProxyRouter({
proxies: { DEFAULT: 'http://user:pass@proxyhost:port' },
})
)
// That's it, the default proxy will be used and proxy authentication handled automatically
chromium.launch({ headless: false }).then(async (browser) => {
const page = await browser.newPage()
await page.goto('https://canhazip.com', { waitUntil: 'domcontentloaded' })
const ip = await page.evaluate('document.body.innerText')
console.log('Outbound IP:', ip)
await browser.close()
})
`
Use multiple proxies and route connections flexibly
`js
// playwright-extra is a drop-in replacement for playwright,
// it augments the installed playwright with plugin functionality
// Note: Instead of chromium you can use firefox and webkit as well.
const { chromium } = require('playwright-extra')
// Configure the proxy router plugin
const ProxyRouter = require('@extra/proxy-router')
const proxyRouter = ProxyRouter({
// define the available proxies (replace this with your proxies)
proxies: {
// the default browser proxy, can be null as well for direct connectionsrouteByHost
DEFAULT: 'http://user:pass@proxyhost:port',
// optionally define more proxies you can use in DEFAULT
// you can use whatever names you'd like for them
DATACENTER: 'http://user:pass@proxyhost2:port',
RESIDENTIAL_US: 'http://user:pass@proxyhost3:port',
},
// optional function for flexible proxy routing
// if this is not specified the proxy will be used for all connectionsDEFAULT
routeByHost: async ({ host }) => {
if (['pagead2.googlesyndication.com', 'fonts.gstatic.com'].includes(host)) {
return 'ABORT' // block connection to certain hosts
}
if (host.includes('google')) {
return 'DIRECT' // use a direct connection for all google domains
}
if (host.endsWith('.tile.openstreetmap.org')) {
return 'DATACENTER' // route heavy images through datacenter proxy
}
if (host === 'canhazip.com') {
return 'RESIDENTIAL_US' // special proxy for this domain
}
// everything else will use proxy
},
})
// Add the plugin
chromium.use(proxyRouter)
// Launch a browser and run some IP checks
chromium.launch({ headless: true }).then(async (browser) => {
const page = await browser.newPage()
await page.goto('https://showmyip.com/', { waitUntil: 'domcontentloaded' })
const ip1 = await page.evaluate("document.querySelector('#ipv4').innerText")
console.log('Outbound IP #1:', ip1)
// => 77.191.128.0 (the DEFAULT proxy)
await page.goto('https://canhazip.com', { waitUntil: 'domcontentloaded' })
const ip2 = await page.evaluate('document.body.innerText')
console.log('Outbound IP #2:', ip2)
// => 104.179.129.27 (the RESIDENTIAL_US proxy)
console.log(proxyRouter.stats.connectionLog) // list of connections (host => proxy name)
// { id: 0, proxy: 'DIRECT', host: 'accounts.google.com' },
// { id: 1, proxy: 'DEFAULT', host: 'www.showmyip.com' },
// { id: 2, proxy: 'ABORT', host: 'pagead2.googlesyndication.com' },
// { id: 3, proxy: 'DEFAULT', host: 'unpkg.com' },
// ...
console.log(proxyRouter.stats.byProxy) // bytes used by proxy
// {
// DATACENTER: 441734,
// DEFAULT: 125823,
// DIRECT: 100457,
// RESIDENTIAL_US: 4764,
// ABORT: 0
// }
console.log(proxyRouter.stats.byHost) // bytes used by host
// {
// 'a.tile.openstreetmap.org': 150685,
// 'c.tile.openstreetmap.org': 147054,
// 'b.tile.openstreetmap.org': 143995,
// 'unpkg.com': 57621,
// 'www.googletagmanager.com': 49572,
// 'www.showmyip.com': 40408,
// ...
await browser.close()
})
`
Usage with Puppeteer
> The code is essentially the same as the playwright example above. :-)
Just change the import and package name:
`diff`
- const { chromium } = require('playwright-extra')
+ const puppeteer = require('puppeteer-extra')
// ...
- chromium.use(proxyRouter)
+ puppeteer.use(proxyRouter)
// ...
- chromium.launch()
+ puppeteer.launch()
// ...
Typescript & ESM
> The plugin is written in Typescript and ships with types.
Playwright:
`js`
// You can use any browser: chromium, firefox, webkit
import { firefox } from 'playwright-extra'
import ProxyRouter from '@extra/proxy-router'
// ...
firefox.use(proxyRouter)
Puppeteer:
`js`
import puppeteer from 'puppeteer-extra'
import ProxyRouter from '@extra/proxy-router'
// ...
puppeteer.use(proxyRouter)
If you'd like to see debug output just run your script like so:
`bashmacOS/Linux (Bash)
DEBUG=proxy-router node myscript.js
How it works
The proxy router will launch a local proxy server and instruct the browser to use it.
That local proxy server will in turn connect to the configured upstream proxy servers and relay connections depending on the optional user-defined routing function, while handling upstream proxy authentication and a few other things.
API
$3
`ts
export interface ProxyRouterOpts {
/**
* A dictionary of proxies to be made available to the browser and router.
*
* An optional entry named DEFAULT will be used for all requests, unless overriden by routeByHost.
* If the DEFAULT entry is omitted no proxy will be used by default.
*
* The value of an entry can be a string (format: http://user:pass@proxyhost:port) or null (direct connection).
* Proxy authentication is handled automatically by the router.
*
* @example
* proxies: {
* DEFAULT: "http://user:pass@proxyhost:port", // use this proxy by default
* RESIDENTIAL_US: "http://user:pass@proxyhost2:port" // use this for specific hosts with routeByHost
* }
*/
proxies?: {
/**
* The default proxy for the browser (format: http://user:pass@proxyhost:port),
* if omitted or null no proxy will be used by default
*/
DEFAULT?: string | null
/**
* Any other custom proxy names which can be used for routing later
* (e.g. 'DATACENTER_US': 'http://user:pass@proxyhost:port')
*/
[key: string]: string | null
} /**
* An optional function to allow proxy routing based on the target host of the request.
*
* A return value of nothing,
null or DEFAULT will result in the DEFAULT proxy being used as configured.
* A return value of DIRECT will result in no proxy being used.
* A return value of ABORT will cancel/block this request.
*
* Any other string as return value is assumed to be a reference to the configured proxies dict.
*
* @note The browser will most often establish only a single proxy connection per host.
*
* @example
* routeByHost: async ({ host }) => {
* if (host.includes('google')) { return "DIRECT" }
* return 'RESIDENTIAL_US'
* }
*
*/
routeByHost?: RouteByHostFn
/* Collect traffic and connection stats, default: true /
collectStats?: boolean
/* Don't print any proxy connection errors to stderr, default: false /
muteProxyErrors?: boolean
/* Suppress proxy errors for specific hosts /
muteProxyErrorsForHost?: string[]
/* Options for the local proxy-chain server /
proxyServerOpts?: ProxyServerOpts
/**
* Optionally exempt hosts from going through a proxy, even our internal routing proxy.
*
* Examples:
* .com or chromium.org or .domain.com
*
* @see
* https://chromium.googlesource.com/chromium/src/+/HEAD/net/docs/proxy.md#proxy-bypass-rules
* https://www-archive.mozilla.org/quality/networking/docs/aboutno_proxy_for.html
*/
proxyBypassList?: string[]
}
`Alternatives
$3
- Only supported in chromium in headful mode
- Despite the name (
FindProxyForURL`) can only route by host- Advantage: Route proxies by page not host
- They rely on a massive hack: Using Node.js to send the requests instead of the browser
- Will change the TLS fingerprint, error prone
- Uses CDP request interception which is chromium only
- Increased latency and resource overhead
Copyright © 2018 - 2023, berstend̡̲̫̹̠̖͚͓̔̄̓̐̄͛̀͘. Released under the MIT License.
[playwright-extra]: https://github.com/berstend/puppeteer-extra/tree/master/packages/playwright-extra
[puppeteer-extra]: https://github.com/berstend/puppeteer-extra/tree/master/packages/puppeteer-extra