Analyze text against a word-frequency map: find rarest words, unknown words, and table data (rank, frequency, Zipf, etc.). Used by Rare Word Finder.
```bash
npm install rare-word-analyzer
```
Or use the script in this repo (no build step):
`html`
parseFrequencyCSV parses a CSV string with the columns word and log_frequency (the first line is the header) and returns a map { word: logFreq }.
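To make the expected CSV shape concrete, here is a minimal sketch of how such a parser could work. It is not the package's actual implementation: the function name `parseFrequencyCsvSketch` and the handling of blank or malformed rows are assumptions made for illustration.

```js
// Hypothetical sketch, not the package's real parser.
function parseFrequencyCsvSketch(csv) {
  const map = {};
  const lines = csv.split(/\r?\n/);
  // Skip the header row (index 0) and ignore blank lines.
  for (const line of lines.slice(1)) {
    if (line.trim() === '') continue;
    // Split on the last comma so the numeric column is always isolated.
    const comma = line.lastIndexOf(',');
    if (comma === -1) continue; // assumed behavior: skip rows without a separator
    const word = line.slice(0, comma).trim();
    const raw = line.slice(comma + 1).trim();
    if (word === '' || raw === '') continue;
    const logFreq = Number(raw);
    if (Number.isFinite(logFreq)) {
      map[word] = logFreq;
    }
  }
  return map;
}

const sample = 'word,log_frequency\nthe,-2.3\nrare,-12.1\n';
console.log(parseFrequencyCsvSketch(sample)); // { the: -2.3, rare: -12.1 }
```

Rows that cannot be parsed are silently skipped in this sketch; the real function may behave differently.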
- text (string): Input text.
- wordFrequencyMap (object): Map of word → log frequency (e.g. from parseFrequencyCSV).
- options (object, optional):
  - topRarest (number, default 10): Number of rarest words to return.
Returns:
- rarestWords: Array of normalized words (rarest first).
- unknownWords: Array of display forms of words not in the frequency map (sorted).
- unknownSet: Set of normalized unknown words (for highlighting).
- tableData: Array of rows [rank, word, frequencyStr, inverseFreq, zipf] for the table.
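```js
// Assumes the package exposes these functions as CommonJS named exports.
const { parseFrequencyCSV, analyzeText } = require('rare-word-analyzer');

const csv = 'word,log_frequency\nthe,-2.3\nrare,-12.1\n';
const freq = parseFrequencyCSV(csv);
const result = analyzeText('The rare word.', freq, { topRarest: 5 });
// result.rarestWords, result.unknownWords, result.tableData, result.unknownSet
```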
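For intuition about how rarestWords and unknownWords relate to the frequency map, the following sketch ranks known words by log frequency and collects the rest as unknown. It is only an illustration: the function name `rarestAndUnknownSketch`, the tokenization regex, the lowercase normalization, and the default of 10 are all assumptions, and unlike the real API it returns normalized forms rather than display forms for unknown words.

```js
// Hypothetical sketch; tokenization and normalization rules are assumptions.
function rarestAndUnknownSketch(text, wordFrequencyMap, topRarest = 10) {
  const has = (w) => Object.prototype.hasOwnProperty.call(wordFrequencyMap, w);

  // Assumed normalization: lowercase, keep runs of letters and apostrophes.
  const tokens = text.toLowerCase().match(/[a-z']+/g) || [];
  const unique = [...new Set(tokens)];

  const known = unique.filter((w) => has(w));
  const unknownWords = unique.filter((w) => !has(w)).sort();

  // A lower log frequency means a rarer word, so sort ascending.
  const rarestWords = known
    .sort((a, b) => wordFrequencyMap[a] - wordFrequencyMap[b])
    .slice(0, topRarest);

  return { rarestWords, unknownWords };
}

const demo = rarestAndUnknownSketch('The rare word.', { the: -2.3, rare: -12.1 }, 5);
console.log(demo.rarestWords);  // [ 'rare', 'the' ]
console.log(demo.unknownWords); // [ 'word' ]
```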
If you maintain this package and want to publish it to npm so others can npm install rare-word-analyzer:
- Go to npmjs.com and sign up.
- Verify your email if required.
From your machine (not inside the package folder yet):
```bash
npm login
```
Enter your npm username, password, and email when prompted. Enabling two-factor authentication (2FA) on your npm account is recommended.
The name in package.json is rare-word-analyzer. Check if it’s taken:
```bash
npm view rare-word-analyzer
```
If you get a 404 error, the name is available. If the name is taken, use a scoped package (see the scoped-name instructions below).
Edit packages/rare-word-analyzer/package.json and add:
- author – e.g. "Your Name" or "Your Name <email>"
- homepage – e.g. "https://github.com/ethanuser/rare-word-finder#readme"
- bugs – e.g. { "url": "https://github.com/ethanuser/rare-word-finder/issues" }
The repository and files fields are already set; files controls what gets published (only index.js and README.md by default).
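Before publishing, it can help to confirm that the metadata fields listed above are actually present. The helper below is a hypothetical sketch (the name `missingMetadataSketch` and the choice of fields to check are assumptions); it takes an already-parsed package.json object and reports which of author, homepage, and bugs are absent or empty.

```js
// Hypothetical helper: lists recommended metadata fields missing from a parsed package.json.
function missingMetadataSketch(pkg) {
  const recommended = ['author', 'homepage', 'bugs'];
  return recommended.filter((field) => {
    const value = pkg[field];
    if (value === undefined || value === null) return true;
    // Treat blank strings as missing.
    if (typeof value === 'string') return value.trim() === '';
    return false;
  });
}

// Example input: a package.json that has not had its metadata filled in yet.
const pkg = {
  name: 'rare-word-analyzer',
  version: '1.0.0',
  files: ['index.js', 'README.md'],
};
console.log(missingMetadataSketch(pkg)); // [ 'author', 'homepage', 'bugs' ]
```

In a real script you would load the object with `JSON.parse(fs.readFileSync('package.json', 'utf8'))` from the package directory.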
From the repository root:
```bash
cd packages/rare-word-analyzer
npm publish
```
Or from anywhere:
```bash
npm publish --workspace=packages/rare-word-analyzer
```
The --workspace form works only if you have a root package.json with "workspaces": ["packages/*"]. Otherwise use the cd method.
- First publish: The package will be public (unless you use a private registry or scoped package with restricted access).
- Success: You’ll see something like + rare-word-analyzer@1.0.0 and the package will be at https://www.npmjs.com/package/rare-word-analyzer.
In package.json, change the name to your scope (your npm username):
```json
"name": "@your-npm-username/rare-word-analyzer"
```
Then publish with public access (scoped packages are private by default):
```bash
npm publish --access public
```
Users will install with:
```bash
npm install @your-npm-username/rare-word-analyzer
```
After changing the code:
1. Bump the version in packages/rare-word-analyzer/package.json:
   - Patch (bug fixes): 1.0.0 → 1.0.1
   - Minor (new features, backward compatible): 1.0.0 → 1.1.0
   - Major (breaking changes): 1.0.0 → 2.0.0
Or use the CLI from the package directory:
```bash
cd packages/rare-word-analyzer
npm version patch   # 1.0.0 -> 1.0.1
npm version minor   # 1.0.0 -> 1.1.0
npm version major   # 1.0.0 -> 2.0.0
```
2. Publish again:
```bash
npm publish
```
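The patch/minor/major rules from step 1 can be sketched in a few lines. This is an illustration of the arithmetic only, not how `npm version` is implemented; the function name `bumpVersionSketch` is hypothetical, and it handles plain MAJOR.MINOR.PATCH strings without prerelease or build suffixes.

```js
// Hypothetical sketch of semantic-version bumping for plain MAJOR.MINOR.PATCH strings.
function bumpVersionSketch(version, release) {
  const parts = version.split('.').map(Number);
  if (parts.length !== 3 || parts.some((n) => !Number.isInteger(n) || n < 0)) {
    throw new Error(`Unsupported version string: ${version}`);
  }
  const [major, minor, patch] = parts;
  switch (release) {
    case 'major':
      return `${major + 1}.0.0`; // breaking change: reset minor and patch
    case 'minor':
      return `${major}.${minor + 1}.0`; // new feature: reset patch
    case 'patch':
      return `${major}.${minor}.${patch + 1}`; // bug fix
    default:
      throw new Error(`Unknown release type: ${release}`);
  }
}

console.log(bumpVersionSketch('1.0.0', 'patch')); // 1.0.1
console.log(bumpVersionSketch('1.0.0', 'minor')); // 1.1.0
console.log(bumpVersionSketch('1.0.0', 'major')); // 2.0.0
```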
To list the files that will be included in the tarball:
```bash
cd packages/rare-word-analyzer
npm pack --dry-run
```
Only files listed in the "files" array in package.json (plus package.json and README.md by default) are included.
---
MIT