Normalize URLs with configurable options for consistent URL handling

```bash
npm install @tpmjs/tools-url-normalize
```

```typescript
import { urlNormalizeTool } from '@tpmjs/tools-url-normalize';

// Use with AI SDK
const result = await urlNormalizeTool.execute({
  url: 'HTTPS://WWW.Example.COM:443/path/?z=1&a=2#section',
  options: {
    sortParams: true,
    removeHash: true,
    lowercase: true,
    removeTrailingSlash: true,
    removeDefaultPort: true,
    removeWWW: true,
  },
});

console.log(result.normalized);
// 'https://example.com/path?a=2&z=1'

console.log(result.changes);
// [
//   { type: 'lowercase', description: '...', before: 'HTTPS:', after: 'https:' },
//   { type: 'lowercase', description: '...', before: 'WWW.Example.COM', after: 'www.example.com' },
//   { type: 'removeWWW', description: '...', before: 'www.example.com', after: 'example.com' },
//   { type: 'removeDefaultPort', description: '...', before: '443', after: '' },
//   { type: 'sortParams', description: '...', before: '?z=1&a=2', after: '?a=2&z=1' },
//   { type: 'removeHash', description: '...', before: '#section', after: '' },
// ]
```

- Lowercase conversion: Converts protocol and hostname to lowercase
- Trailing slash removal: Removes trailing slashes from pathnames
- Query parameter sorting: Sorts query parameters alphabetically
- Hash removal: Optionally removes URL fragments
- Default port removal: Removes standard ports (80, 443, 21)
- WWW removal: Optionally removes www subdomain
- Change tracking: Reports all transformations applied
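
To make the transformations listed above concrete, here is a minimal sketch built on the standard WHATWG `URL` class. The helper name `sketchNormalize` is hypothetical and this is not the package's actual implementation; it only illustrates what each step does to a URL, with every step switched on and no change tracking.

```typescript
// Illustrative sketch only: `sketchNormalize` is a hypothetical helper,
// not the code inside @tpmjs/tools-url-normalize.
function sketchNormalize(input: string): string {
  // Parsing with the WHATWG URL class already lowercases the scheme and
  // hostname, and drops the port when it is the default for the scheme.
  const url = new URL(input);

  // WWW removal
  if (url.hostname.startsWith('www.')) {
    url.hostname = url.hostname.slice('www.'.length);
  }

  // Trailing slash removal (the root path "/" is left alone)
  if (url.pathname.length > 1 && url.pathname.endsWith('/')) {
    url.pathname = url.pathname.slice(0, -1);
  }

  // Query parameter sorting (by parameter name)
  url.searchParams.sort();

  // Hash removal
  url.hash = '';

  return url.toString();
}

console.log(sketchNormalize('HTTPS://WWW.Example.COM:443/path/?z=1&a=2#section'));
// 'https://example.com/path?a=2&z=1'
```
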
- `url` (string, required): The URL to normalize
- `options` (object, optional): Normalization options
  - `sortParams` (boolean, default: true): Sort query parameters alphabetically
  - `removeHash` (boolean, default: false): Remove URL fragment/hash
  - `lowercase` (boolean, default: true): Convert protocol and hostname to lowercase
  - `removeTrailingSlash` (boolean, default: true): Remove trailing slash from pathname
  - `removeDefaultPort` (boolean, default: true): Remove default ports
  - `removeWWW` (boolean, default: false): Remove www subdomain
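
The defaults above mean that a call passing only `url` sorts parameters, lowercases, and strips trailing slashes and default ports, while keeping the hash and the `www` prefix. One way to picture how partial options combine with those defaults is the sketch below. The names `NormalizeOptions`, `DEFAULT_OPTIONS`, and `resolveOptions` are assumptions made for illustration and are not exported by the package.

```typescript
// Hypothetical types and helper, shown only to illustrate the documented defaults.
interface NormalizeOptions {
  sortParams?: boolean;
  removeHash?: boolean;
  lowercase?: boolean;
  removeTrailingSlash?: boolean;
  removeDefaultPort?: boolean;
  removeWWW?: boolean;
}

// Default values as listed in the parameter documentation.
const DEFAULT_OPTIONS: Required<NormalizeOptions> = {
  sortParams: true,
  removeHash: false,
  lowercase: true,
  removeTrailingSlash: true,
  removeDefaultPort: true,
  removeWWW: false,
};

// Caller-supplied options override the defaults; omitted keys fall back to them.
function resolveOptions(options: NormalizeOptions = {}): Required<NormalizeOptions> {
  return { ...DEFAULT_OPTIONS, ...options };
}

const resolved = resolveOptions({ removeHash: true });
console.log(resolved.removeHash); // true (overridden)
console.log(resolved.sortParams); // true (default)
```
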

```typescript
{
  normalized: string;
  original: string;
  changes: Array<{
    type: string;
    description: string;
    before: string;
    after: string;
  }>;
  metadata: {
    protocol: string;
    hostname: string;
    pathname: string;
    hasQueryParams: boolean;
    hasHash: boolean;
  };
}
```

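
A result of this shape can be inspected to see whether anything changed and why. The sketch below declares the shape locally and adds two hypothetical helpers, `wasModified` and `summarizeChanges`. The `sample` object is hand-written for illustration and is not captured output from the tool.

```typescript
// Local declaration of the documented return shape.
interface UrlChange {
  type: string;
  description: string;
  before: string;
  after: string;
}

interface NormalizeResult {
  normalized: string;
  original: string;
  changes: UrlChange[];
  metadata: {
    protocol: string;
    hostname: string;
    pathname: string;
    hasQueryParams: boolean;
    hasHash: boolean;
  };
}

// True when normalization produced a different string than the input.
function wasModified(result: NormalizeResult): boolean {
  return result.normalized !== result.original;
}

// One human-readable line per recorded change.
function summarizeChanges(result: NormalizeResult): string[] {
  return result.changes.map((c) => `${c.type}: "${c.before}" -> "${c.after}"`);
}

// Hand-written sample value (an assumption, not real tool output).
const sample: NormalizeResult = {
  normalized: 'https://example.com/path',
  original: 'https://example.com/path#top',
  changes: [
    { type: 'removeHash', description: 'Removed hash', before: '#top', after: '' },
  ],
  metadata: {
    protocol: 'https:',
    hostname: 'example.com',
    pathname: '/path',
    hasQueryParams: false,
    hasHash: false,
  },
};

console.log(wasModified(sample)); // true
console.log(summarizeChanges(sample)); // [ 'removeHash: "#top" -> ""' ]
```
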
- URL deduplication in web crawlers
- Canonical URL generation for SEO
- Link comparison and matching
- Database indexing of URLs
- Analytics tracking consistency
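
As an illustration of the deduplication use case, the sketch below keeps the first URL seen for each normalized form. `dedupeUrls` and `standInNormalize` are hypothetical names. The stand-in normalizer uses only the built-in `URL` class so the example runs on its own; in practice the key could instead come from the `normalized` field returned by the tool.

```typescript
// Keep the first URL for each distinct normalized key.
function dedupeUrls(urls: string[], normalize: (url: string) => string): string[] {
  const seen = new Set<string>();
  const unique: string[] = [];
  for (const url of urls) {
    const key = normalize(url);
    if (!seen.has(key)) {
      seen.add(key);
      unique.push(url);
    }
  }
  return unique;
}

// Stand-in normalizer (an assumption for this example): drops the hash and
// sorts query parameters; URL parsing lowercases the scheme and hostname.
function standInNormalize(raw: string): string {
  const url = new URL(raw);
  url.hash = '';
  url.searchParams.sort();
  return url.toString();
}

const crawled = [
  'https://Example.com/a?b=2&a=1#top',
  'https://example.com/a?a=1&b=2',
  'https://example.com/b',
];

console.log(dedupeUrls(crawled, standInNormalize));
// [ 'https://Example.com/a?b=2&a=1#top', 'https://example.com/b' ]
```
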

```typescript
import { openai } from '@ai-sdk/openai';
import { generateText } from 'ai';
import { urlNormalizeTool } from '@tpmjs/tools-url-normalize';

const result = await generateText({
  model: openai('gpt-4'),
  tools: {
    urlNormalize: urlNormalizeTool,
  },
  prompt: 'Normalize this URL and remove the hash: https://Example.com/PATH/?b=2&a=1#top',
});
```

MIT