Truncate HTML and Keep Tags
Notice: This is a Node.js module that depends on cheerio and _can only run on Node.js_. If you need a browser version, consider truncate or nodejs-html-truncate.
```javascript
const truncate = require('truncate-html')
truncate('<p><img src="xxx.jpg">Hello from earth!</p>', 11)
// => <p><img src="xxx.jpg">Hello from ...</p>
```
`npm install truncate-html` or `yarn add truncate-html`
```ts
/**
 * custom node strategy, called with a Cheerio<AnyNode>; return
 * - 'remove' to remove the node
 * - 'keep' to keep the node (and anything inside it) anyway; it is not counted, as if it had no text content
 * - Cheerio<AnyNode> to truncate the returned node
 * - undefined or any falsy value to truncate the original node
 */
type ICustomNodeStrategy = (node: Cheerio<AnyNode>) => 'remove' | 'keep' | Cheerio<AnyNode> | undefined

/**
 * truncate-html full options object
 */
interface IFullOptions {
  /**
   * remove all tags, default false
   */
  stripTags: boolean
  /**
   * ellipsis sign, default '...'
   */
  ellipsis: string
  /**
   * decode html entities (e.g. convert `&amp;` to `&`) before counting length, default false
   */
  decodeEntities: boolean
  /**
   * selector(s) of the elements you want to ignore
   */
  excludes: string | string[]
  /**
   * custom node strategy, see ICustomNodeStrategy above
   */
  customNodeStrategy: ICustomNodeStrategy
  /**
   * how many letters (words if `byWords` is true) you want to keep
   */
  length: number
  /**
   * if true, length means how many words to keep
   */
  byWords: boolean
  /**
   * what to do when the cut falls in the middle of a word
   * 1. by default, just cut at that position.
   * 2. set it to true to keep the last word, exceeding length by at most 10 letters
   * 3. set it to a positive number to decide how many letters may exceed length to keep the last word
   * 4. set it to a negative number to remove the last word if the cut falls in the middle of it.
   */
  reserveLastWord: boolean | number
  /**
   * if reserveLastWord is set to a negative number and there is only one word in the html string,
   * setting trimTheOnlyWord to true slices off the extra letters when the word is longer than `length`.
   * see issue #23 for more details
   */
  trimTheOnlyWord: boolean
  /**
   * keep whitespace; by default consecutive whitespace characters are
   * replaced with a single space, set it to true to keep them
   */
  keepWhitespaces: boolean
}

/**
 * options interface for the `truncate` function
 */
type IOptions = Partial<IFullOptions>

function truncate(html: string | CheerioAPI, length?: number | IOptions, truncateOptions?: IOptions): string
// and truncate.setup to change default options
truncate.setup(options: IOptions): void
```
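The `reserveLastWord` rules listed in the option comments above can be read as a choice of cut position. The sketch below is one reading of those rules applied to a plain string, not the library's implementation: `cutIndex` is a hypothetical helper, and its handling of a single-word string under a negative value is an assumption.

```javascript
// Hypothetical helper (not part of truncate-html): where would a plain
// string be cut for a given `length` and `reserveLastWord` value?
function cutIndex(text, length, reserveLastWord = false) {
  if (text.length <= length) return text.length // nothing to truncate

  // The cut splits a word only if both neighbours of the cut are non-space.
  const midWord = /\S/.test(text[length - 1]) && /\S/.test(text[length])
  if (!reserveLastWord || !midWord) return length // rule 1: cut in place

  if (typeof reserveLastWord === 'number' && reserveLastWord < 0) {
    // rule 4: drop the partial word by backing up to the previous space
    const lastSpace = text.lastIndexOf(' ', length - 1)
    // Assumption: with no earlier space (a single word), cut at `length`.
    return lastSpace === -1 ? length : lastSpace + 1
  }

  // rules 2 and 3: run to the end of the word, within the allowed excess
  const maxExceed = reserveLastWord === true ? 10 : reserveLastWord
  let end = length
  while (end < text.length && /\S/.test(text[end])) end++
  return Math.min(end, length + maxExceed)
}

const sample = 'Hello wonderful world'
console.log(sample.slice(0, cutIndex(sample, 8)))       // 'Hello wo'
console.log(sample.slice(0, cutIndex(sample, 8, true))) // 'Hello wonderful'
console.log(sample.slice(0, cutIndex(sample, 8, 3)))    // 'Hello wonde'
console.log(sample.slice(0, cutIndex(sample, 8, -1)))   // 'Hello '
```

The sketch ignores tags, entities, and `byWords`; it only shows how the four documented settings differ on the same input.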
Default options:

```js
{
  stripTags: false,
  ellipsis: '...',
  decodeEntities: false,
  excludes: '',
  byWords: false,
  reserveLastWord: false,
  trimTheOnlyWord: false,
  keepWhitespaces: false
}
```
You can change the default options with `truncate.setup`, e.g.

```ts
truncate.setup({ stripTags: true, length: 10 })
truncate('<p><img src="xxx.jpg">Hello from earth!</p>')
// => Hello from
```
or use an existing cheerio instance:

```ts
import * as cheerio from 'cheerio'
import truncate from 'truncate-html'

truncate.setup({ stripTags: true, length: 10 })
// the truncate option `decodeEntities` will not work here;
// configure it in the cheerio options yourself
const $ = cheerio.load('<p><img src="xxx.jpg">Hello from earth!</p>', {
  /* set decodeEntities if you need it */
  decodeEntities: true
  /* any cheerio instance options */
}, false) // the third parameter is the `isDocument` option; set it to false to get rid of extra wrappers, see cheerio's docs for details
truncate($)
// => Hello from
```
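The `truncate(html, length?, truncateOptions?)` signature accepts either a number or an options object as the second argument, and `truncate.setup` changes the defaults. One plausible way to picture how these combine is sketched below. `setupDefaults` and `resolveOptions` are hypothetical names, and the precedence shown (per-call options over defaults, a numeric length argument over both) is an assumption rather than documented behavior.

```javascript
// Hypothetical sketch of option resolution; not truncate-html's source.
let defaults = {
  stripTags: false,
  ellipsis: '...',
  decodeEntities: false,
  excludes: '',
  byWords: false,
  reserveLastWord: false,
  trimTheOnlyWord: false,
  keepWhitespaces: false
}

// Stand-in for truncate.setup: merge new defaults over the old ones.
function setupDefaults(options) {
  defaults = { ...defaults, ...options }
}

// Normalize the (length | options, options) argument pair into one object.
function resolveOptions(length, options = {}) {
  if (typeof length === 'object' && length !== null) {
    return { ...defaults, ...length } // options passed as second argument
  }
  const resolved = { ...defaults, ...options }
  // Assumption: an explicit numeric length wins over options.length.
  if (typeof length === 'number') resolved.length = length
  return resolved
}

setupDefaults({ stripTags: true, length: 10 })
console.log(resolveOptions().length)                        // 10
console.log(resolveOptions(20, { ellipsis: '~' }).ellipsis) // '~'
console.log(resolveOptions({ length: 5 }).stripTags)        // true
```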
This library is written in TypeScript and ships with a type definition file. ~~You may need to update your `tsconfig.json` by adding `"esModuleInterop": true` to `compilerOptions` if you encounter typing errors, see #19.~~
```ts
import truncate, { type IOptions } from 'truncate-html'

const html = '<p><i>italic</i><b>bold</b>This is a string</p> for test.'

const options: IOptions = {
  length: 10,
  byWords: true
}

truncate(html, options)
// => <p><i>italic</i><b>bold</b>...</p>
```

In a complex HTML string, you may want to keep some special elements and truncate the others. You can use `customNodeStrategy` to achieve this:
* return `'remove'` to remove the node
* return `'keep'` to keep the node (and anything inside it) anyway; it is not counted, as if it had no text content
* return a `Cheerio` node to truncate the returned node, or any falsy value to truncate the original node

```ts
import truncate, { type IOptions, type ICustomNodeStrategy } from 'truncate-html'

// argument node is a cheerio instance
const customNodeStrategy: ICustomNodeStrategy = node => {
  // remove img tag
  if (node.is('img')) {
    return 'remove'
  }
  // keep italic tag and its children
  if (node.is('i')) {
    return 'keep'
  }
  // truncate the summary tag inside the details tag instead of the details tag itself
  if (node.is('details')) {
    return node.find('summary')
  }
}

const html = '<div><img src="abc.png"><i>italic</i><b>bold</b><details><summary>Click me</summary><p>Some details</p></details>This is a string for test.</div>'

const options: IOptions = {
  length: 10,
  customNodeStrategy
}

truncate(html, options)
// => <div><i>italic</i><b>bold</b><details><summary>Click me</summary><p>Some details</p></details>Th...</div>
```

If the text content of the HTML string is shorter than `options.length`, no ellipsis is appended to the final HTML string. If it is longer, the length of the final text is `options.length` plus the length of `options.ellipsis`. If you set `reserveLastWord` to true or a non-zero number, or use `customNodeStrategy`, the final length will vary.

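As a rough illustration of the rule above, the helper below computes the expected text length of the result in the simple case: no `reserveLastWord`, no `customNodeStrategy`, and counting letters rather than words. `expectedTextLength` is a hypothetical name, not part of the library. It counts text only (not tags), counts by code point so that an emoji is one letter, and treats text exactly as long as `length` as not truncated, which is an assumption.

```javascript
// Hypothetical helper: expected length of the truncated text content.
function expectedTextLength(text, length, ellipsis = '...') {
  const letters = Array.from(text).length // code points, so '💩' counts once
  if (letters <= length) return letters   // short enough: no ellipsis added
  return length + Array.from(ellipsis).length
}

console.log(expectedTextLength('This is a string for test.', 10))      // 13
console.log(expectedTextLength('This is a string for test.', 10, '~')) // 11
console.log(expectedTextLength('Hi', 10))                              // 2
```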
All HTML comments (`<!-- ... -->`) will be removed.

Non-alphabetic languages such as Chinese, Japanese, and Korean do not separate words with whitespace, so the options `byWords` and `reserveLastWord` only work well with alphabetic languages.

cheerio, the only dependency of this project, also has an issue when dealing with non-alphabetic languages; see Known Issues for details.

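To see why word-based options struggle with CJK text, consider a naive whitespace split. This only illustrates the general problem; it is not how the library tokenizes, and `countWords` is a hypothetical helper.

```javascript
// Hypothetical helper: count words by splitting on runs of whitespace.
function countWords(text) {
  return text.split(/\s+/).filter(Boolean).length
}

console.log(countWords('This is a string for test.')) // 6
console.log(countWords('这是一个用于测试的字符串'))      // 1
```

A whole Chinese sentence is seen as a single "word", so a word count cannot be used to pick a sensible cut position.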
If you want to use an existing cheerio instance, the truncate option `decodeEntities` will not work; set it in your own cheerio instance instead:

```js
const cheerio = require('cheerio')
const truncate = require('truncate-html')

var html = '<p><img src="abc.png">This is a string</p> for test.'
const $ = cheerio.load(html, {
  decodeEntities: true
  /* other cheerio options */
}, false) // the third parameter is the `isDocument` option; set it to false to get rid of extra wrappers, see cheerio's docs for details
truncate($, 10)
```

Examples
```javascript
var truncate = require('truncate-html')

// truncate html
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, 10)
// returns: <p><img src="abc.png">This is a ...</p>

// truncate string with emojis
var string = '<p>poo 💩💩💩💩💩</p>'
truncate(string, 6)
// returns: <p>poo 💩💩...</p>

// with options, remove all tags
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, 10, { stripTags: true })
// returns: This is a ...

// with options, truncate by words.
// if you try to truncate a non-alphabetic language (like CJK),
// it will not act as you wish
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, 3, { byWords: true })
// returns: <p><img src="abc.png">This is a ...</p>

// with options, keep whitespaces
var html = '<p>         <img src="abc.png">This is a string</p> for test.'
truncate(html, 10, { keepWhitespaces: true })
// returns: <p>         <img src="abc.png">This is a ...</p>

// combine length and options
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, {
  length: 10,
  stripTags: true
})
// returns: This is a ...

// custom ellipsis sign
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, {
  length: 10,
  ellipsis: '~'
})
// returns: <p><img src="abc.png">This is a ~</p>

// exclude some special elements(by selector), they will be removed before counting content's length
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, {
  length: 10,
  ellipsis: '~',
  excludes: 'img'
})
// returns: <p>This is a ~</p>

// exclude more than one category elements
var html =
  '<p><img src="abc.png">This is a string</p><div class="something-unwanted"> unwanted string inserted ( ´•̥̥̥ω•̥̥̥ )</div> for test.'
truncate(html, {
  length: 20,
  stripTags: true,
  ellipsis: '~',
  excludes: ['img', '.something-unwanted']
})
// returns: This is a string for~

// handling encoded characters
var html = '<p>&nbsp;test for &lt;p&gt; encoded string</p>'
truncate(html, {
  length: 20,
  decodeEntities: true
})
// returns: <p> test for &lt;p&gt; encode...</p>

// when set decodeEntities false
var html = '<p>&nbsp;test for &lt;p&gt; encoded string</p>'
truncate(html, {
  length: 20,
  decodeEntities: false // this is the default value
})
// returns: <p>&nbsp;test for &lt;p...</p>

// and there may be a surprise when setting `decodeEntities` to true while handling CJK characters
var html = '<p>&nbsp;test for &lt;p&gt; 中文 string</p>'
truncate(html, {
  length: 20,
  decodeEntities: true
})
// returns: <p> test for &lt;p&gt; &#x4E2D;&#x6587; str...</p>
// to fix this, see below for instructions

// custom node strategy to keep some special elements
var html = '<p><img src="abc.png"><i>italic</i><b>bold</b>This is a string</p> for test.'
truncate(html, {
  length: 10,
  customNodeStrategy: node => {
    if (node.is('img')) {
      return 'remove'
    }
    if (node.is('i')) {
      return 'keep'
    }
  }
})
// returns: <p><i>italic</i><b>bold</b>This is a ...</p>

// custom node strategy to truncate summary instead of original node
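var html = '<div><details><summary>Click me</summary><p>Some details</p></details>other things</div>'
truncate(html, {
  length: 10,
  customNodeStrategy: node => {
    if (node.is('details')) {
      return node.find('summary')
    }
  }
})
// returns: <div><details><summary>Click me</summary><p>Some details</p></details>ot...</div>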
```

For more usages, check truncate.spec.ts

Credits
Thanks to:
- @calebeno es6 support and unit tests
- @aaditya-thakkar emoji truncating support