Extracts repeating elements from the DOM into a friendlier JSON format.

A life-enhancing plug-in for [Cypress](https://cypress.io) that lets you easily work with HTML elements and repeating elements, whether for test assertions or for web scraping purposes.

## Installing

Using [npm][npm]:

```bash
$ npm install cypress-harvester --save-dev
```

Enable this plugin by adding this line to your project's `cypress/support/commands.js`:

```javascript
import 'cypress-harvester'
```

## Example - Testing a static table

Given a simple HTML table below:

| Created             | Account Id | Account Holder       | Balance |
|---------------------|------------|----------------------|--------:|
| 10-04-2021 13:40:17 | UA-11876-3 | Terrell E. Evert | $33 |
| 10-04-2021 12:00:17 | UA-10876-1 | James L. Silver | $50.5 |
| 10-04-2021 13:00:17 | UA-10346-1 | Christian A. Lavalle | $-22.98 |
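
```html
<table id="example">
  <thead>
    <tr>
      <th>Created</th>
      <th>Account Id</th>
      <th>Account Holder</th>
      <th>Balance</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>10-04-2021 13:40:17</td>
      <td>UA-11876-3</td>
      <td>Terrell E. Evert</td>
      <td>$33</td>
    </tr>
    <tr>
      <td>10-04-2021 12:00:17</td>
      <td>UA-10876-1</td>
      <td>James L. Silver</td>
      <td>$50.5</td>
    </tr>
    <tr>
      <td>10-04-2021 13:00:17</td>
      <td>UA-10346-1</td>
      <td>Christian A. Lavalle</td>
      <td>$-22.98</td>
    </tr>
  </tbody>
</table>
```

When the table is passed through the Cypress harvester, the data is extracted and converted to a JSON representation of the table:

```javascript
cy.get('#example')
  .scrapeTable()
  .then((table) => {
    expect(table.getData()).to.deep.eq([
      {
        created: '10-04-2021 13:40:17',
        account_id: 'UA-11876-3',
        account_holder: 'Terrell E. Evert',
        balance: '$33',
      },
      {
        created: '10-04-2021 12:00:17',
        account_id: 'UA-10876-1',
        account_holder: 'James L. Silver',
        balance: '$50.5',
      },
      {
        created: '10-04-2021 13:00:17',
        account_id: 'UA-10346-1',
        account_holder: 'Christian A. Lavalle',
        balance: '$-22.98',
      },
    ]);
  });
```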
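
## Example - Testing repeating elements

Given a simple set of repeating elements below:

```html
<div id="sale-items">
  <div class="product">
    <span class="product-name">iPad Pro</span>
    <span class="model">11-inch Liquid Retina Display</span>
    <span class="price">$829.99</span>
  </div>
  <div class="product">
    <span class="product-name">iPad Air</span>
    <span class="model">64 GB Wi-Fi + Cellular</span>
    <span class="price">$599.99</span>
  </div>
  <div class="product">
    <span class="product-name">iPad mini</span>
    <span class="model">256 GB Wi-Fi + Cellular</span>
    <span class="price">$399</span>
  </div>
</div>
```

When the set of repeating elements is passed through the `scrapeElements` scraper, it yields a JSON representation of the data on the page, which you can then assert against or save.

```javascript
cy.get('#sale-items .product')
  .scrapeElements({
    elementsToScrape: [
      { label: 'product_name', locator: '.product-name' },
      { label: 'product_model', locator: '.model' },
      { label: 'item_price', locator: '.price' },
    ],
  })
  .then((scrapedData) => {
    expect(scrapedData.data).to.deep.eq([
      {
        product_name: 'iPad Pro',
        product_model: '11-inch Liquid Retina Display',
        item_price: '$829.99',
      },
      {
        product_name: 'iPad Air',
        product_model: '64 GB Wi-Fi + Cellular',
        item_price: '$599.99',
      },
      {
        product_name: 'iPad mini',
        product_model: '256 GB Wi-Fi + Cellular',
        item_price: '$399',
      },
    ]);
  });
```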

## Example - Web scraping records

This plugin also allows data to be scraped from tables and persisted to a JSON file:

```javascript
cy.get('#example')
  .scrapeTable({
    exportFileName: 'scrapedData.json',
    exportFilePath: 'cypress/downloads',
  })
  .then((table) => {
    expect(table.exportStatus).to.contain(
      'Data table successfully saved'
    );
  });
```

A JSON representation of the HTML table is then saved to a JSON file within the `cypress/downloads` folder.

## Other useful assertions

```javascript
cy.get('#example')
  .scrapeTable()
  .then((table) => {
    // assert the number of records in the table
    expect(table.rowCount()).to.eq(3);

    // validate a record exists in the table
    expect(
      table.hasItem({
        account_holder: 'Christian A. Lavalle',
      })
    ).to.have.property('account_id', 'UA-10346-1');

    // validate a record does not exist in the table
    expect(
      table.hasItem({
        account_holder: 'John Babs',
      })
    ).to.be.undefined;

    // check the table's column labels
    expect(table.columnLabels).to.deep.eq([
      'Created',
      'Account Id',
      'Account Holder',
      'Balance',
    ]);

    // test whether column(s) are sorted
    expect(
      table.isPropertySorted(['account_id'], ['desc']),
      'account_id sorted in desc order'
    ).to.be.true;

    // find records matching a search term for a given property
    // the example below returns only the 2 records where the account holder contains 'ver'
    // it will find [Terrell E. Evert and James L. Silver]
    expect(
      table.containsItem('account_holder', 'ver')
    ).to.deep.eq([
      {
        created: '10-04-2021 13:40:17',
        account_id: 'UA-11876-3',
        account_holder: 'Terrell E. Evert',
        balance: '$33',
      },
      {
        created: '10-04-2021 12:00:17',
        account_id: 'UA-10876-1',
        account_holder: 'James L. Silver',
        balance: '$50.5',
      },
    ]);

    // choose to only return certain properties you wish to validate against
    const mappedProperties = table.getData().map((d) => {
      return {
        account_id: d.account_id,
        balance: d.balance,
      };
    });

    expect(mappedProperties).to.deep.eq([
      {
        account_id: 'UA-11876-3',
        balance: '$33',
      },
      {
        account_id: 'UA-10876-1',
        balance: '$50.5',
      },
      {
        account_id: 'UA-10346-1',
        balance: '$-22.98',
      },
    ]);
  });
```
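
To make the two kinds of lookup used above concrete, here is a minimal plain-JavaScript sketch: an exact match on one or more properties, and a substring search on a single property. The helper names `findFirstMatch` and `findContaining` are hypothetical and are not part of cypress-harvester; the sketch only mirrors the behaviour the assertions above rely on and should not be read as the plugin's implementation.

```javascript
// Sample records, shaped like the scraped table from the first example.
const rows = [
  { account_id: 'UA-11876-3', account_holder: 'Terrell E. Evert', balance: '$33' },
  { account_id: 'UA-10876-1', account_holder: 'James L. Silver', balance: '$50.5' },
  { account_id: 'UA-10346-1', account_holder: 'Christian A. Lavalle', balance: '$-22.98' },
];

// Hypothetical helper: first record whose properties all equal those in
// `criteria`, or undefined when nothing matches.
function findFirstMatch(records, criteria) {
  return records.find((record) =>
    Object.entries(criteria).every(([key, value]) => record[key] === value)
  );
}

// Hypothetical helper: every record whose `property` value contains `term`
// as a case-sensitive substring.
function findContaining(records, property, term) {
  return records.filter((record) => String(record[property]).includes(term));
}

const found = findFirstMatch(rows, { account_holder: 'Christian A. Lavalle' });
console.log(found.account_id); // UA-10346-1

console.log(findFirstMatch(rows, { account_holder: 'John Babs' })); // undefined

const matches = findContaining(rows, 'account_holder', 'ver');
console.log(matches.map((r) => r.account_holder)); // [ 'Terrell E. Evert', 'James L. Silver' ]
```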
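
## Date assertions

To assert on date columns, specify the column index so that the data is interpreted as a date data type. The underlying data is converted to Unix epoch format to allow sort assertions.

```javascript
cy.get('#example')
  .scrapeTable({ dateColumns: [0] })
  .then((table) => {
    // the 'created' (AKA 'Created') date column is not sorted in ascending order
    expect(
      table.isPropertySorted(['created'], ['asc']),
      'created sorted in asc order'
    ).to.be.false;
  });
```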
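
As a rough illustration of why date text is converted before a sort check, the plain-JavaScript sketch below parses the `Created` values into epoch milliseconds and tests whether they are in ascending order. The `toEpoch` and `isSortedAscending` helpers are hypothetical, and the day-month-year reading of the date and the use of UTC are assumptions; the plugin's own parsing may differ.

```javascript
// Hypothetical helper: parse 'DD-MM-YYYY HH:mm:ss' into epoch milliseconds.
// Treating the first field as the day and the time as UTC are assumptions.
function toEpoch(text) {
  const [datePart, timePart] = text.split(' ');
  const [day, month, year] = datePart.split('-').map(Number);
  const [hours, minutes, seconds] = timePart.split(':').map(Number);
  return Date.UTC(year, month - 1, day, hours, minutes, seconds);
}

// Hypothetical helper: true when each value is >= the one before it.
function isSortedAscending(values) {
  return values.every((value, i) => i === 0 || values[i - 1] <= value);
}

const created = [
  '10-04-2021 13:40:17',
  '10-04-2021 12:00:17',
  '10-04-2021 13:00:17',
];

const epochs = created.map(toEpoch);
console.log(isSortedAscending(epochs)); // false (13:40 comes before 12:00 in the table)

const sortedEpochs = [...epochs].sort((a, b) => a - b);
console.log(isSortedAscending(sortedEpochs)); // true
```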

## Use fixture as baseline

Take advantage of the fixtures within Cypress when dealing with large datasets.

```javascript
cy.get('#example')
  .scrapeTable()
  .then((table) => {
    cy.fixture('expected_table_values').then((expectedTableData) => {
      expect(table.getData()).to.deep.eq(expectedTableData);
    });
  });
```
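
## Infer data types and aggregate columns

Provide the index of each numeric column (starting at zero). In the example table above, the balance column index is 3. When this value is supplied to the plug-in, we are able to validate the total sum of the column against an expected value.

```javascript
cy.get('#example')
  .scrapeTable({ decimalColumns: [3] })
  .then((table) => {
    expect(table.sumOfColumn('balance', 2)).to.eq(60.52);
  });
```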
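
The expected total of 60.52 can be checked by hand with a small plain-JavaScript sketch. The `toNumber` and `sumColumn` helpers are hypothetical, and treating the second argument of `sumOfColumn` as a number of decimal places is an assumption based on the example; the plugin's actual conversion rules are not shown here.

```javascript
// Hypothetical helper: strip currency symbols and other non-numeric
// characters, then parse what is left as a floating point number.
function toNumber(text) {
  return Number.parseFloat(text.replace(/[^0-9.-]/g, ''));
}

// Hypothetical helper: add up one property across all records and round
// the total to the requested number of decimal places.
function sumColumn(records, property, decimalPlaces) {
  const total = records.reduce(
    (sum, record) => sum + toNumber(record[property]),
    0
  );
  return Number(total.toFixed(decimalPlaces));
}

const balances = [
  { balance: '$33' },
  { balance: '$50.5' },
  { balance: '$-22.98' },
];

console.log(sumColumn(balances, 'balance', 2)); // 60.52
```

Rounding at the end rather than per value keeps floating point noise from the intermediate additions out of the comparison.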

## Configuration

| Property | Default Value | Type | Purpose |
|----------------------------|---------------|---------|---------|
| exportFileName | | string | The name of the exported file. Both the `exportFileName` and the `exportFilePath` properties must be specified in order to export the scraped data to a file. |
| exportFilePath | | string | Where the exported files are to be saved. |
| includeTimestamp | false | boolean | Adds a unique timestamp to the generated file. |
| propertyNameConvention | snakeCase | string | Controls the naming convention of the resulting JSON representation of the table. The value can be either 'snakeCase' or 'camelCase'. |
| removeAllNewlineCharacters | false | boolean | Instructs the plugin to remove all newline characters from the table cell values. |
| applyDataTypeConversion | false | boolean | Converts the column values to numeric values for the columns specified in the `decimalColumns` property. |
| decimalColumns | [] | array | Applied only when the `applyDataTypeConversion` flag is set to true. Expects an array of integers, each mapping to the index (starting at zero) of a column in the table. The values contained in those columns will be converted to numeric values. |
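
To show what the two `propertyNameConvention` values mean for the keys of each record, here is a plain-JavaScript sketch that converts the column labels from the examples. The `toSnakeCase` and `toCamelCase` helpers are hypothetical. The snake case output matches the keys shown in this README, while the camel case output is an assumption about what that setting would produce.

```javascript
// Hypothetical helper: split a column label into lowercase words.
function toWords(label) {
  return label
    .trim()
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter(Boolean);
}

// 'Account Id' -> 'account_id'
function toSnakeCase(label) {
  return toWords(label).join('_');
}

// 'Account Id' -> 'accountId' (assumed output for the 'camelCase' setting)
function toCamelCase(label) {
  return toWords(label)
    .map((word, i) => (i === 0 ? word : word[0].toUpperCase() + word.slice(1)))
    .join('');
}

const labels = ['Created', 'Account Id', 'Account Holder', 'Balance'];

console.log(labels.map(toSnakeCase));
// [ 'created', 'account_id', 'account_holder', 'balance' ]

console.log(labels.map(toCamelCase));
// [ 'created', 'accountId', 'accountHolder', 'balance' ]
```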
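
## Best Practice

This plugin doesn't wait for data to appear in your table. You can achieve this by adding guards to your code before calling `scrapeTable`.

Examples:

1. Use your application's API to sync against:

   ```javascript
   cy.intercept('api/endpoint/fetches/data').as('myTableData')
   cy.visit('https://myapp.com')
   cy.wait('@myTableData')
   cy.get('#simpleTable')
     .scrapeTable()
     .then((table) => {})
   ```

2. Add a `have.length.above` assertion after your `cy.get` call:

   ```javascript
   cy.get('#simpleTable')
     .should('have.length.above', 0)
     .scrapeTable()
     .then((table) => {})
   ```
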
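## Contributions

Contributions are more than welcome. Go nuts.

## Credits

Solution in this Stack Overflow question: @metarmask

## License

[MIT][mit]
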
[mit]: https://opensource.org/licenses/MIT
[npm]: https://www.npmjs.com/