Atomic CSV data validation and error correction utilities for comprehensive data quality management.
- Comprehensive Field Validation: Support for string, number, integer, boolean, date, email, URL, phone, and custom field types
- Business Rule Validation: Custom validation logic for complex business requirements
- Error Correction: Automatic fixing of common data format issues
- Bulk Processing: Efficient validation of large datasets with parallel processing
- Detailed Error Reporting: Rich error messages with suggestions and confidence scores
- Schema Validation: Validate CSV schemas and field definitions
- Statistics and Analytics: Get validation statistics and error analysis
```bash
npm install @bernierllc/csv-validator
```

```typescript
import { CSVValidator, ValidationSchema } from '@bernierllc/csv-validator';

// Define your validation schema
const schema: ValidationSchema = {
  fields: [
    { name: 'name', type: 'string', required: true, minLength: 2, maxLength: 50 },
    { name: 'email', type: 'email', required: true },
    { name: 'age', type: 'integer', required: false },
    { name: 'active', type: 'boolean', required: false }
  ],
  required: [0, 1] // name and email are required
};

// Create validator
const validator = new CSVValidator(schema);

// Validate a row
const row = ['John Doe', 'john@example.com', '30', 'true'];
const result = validator.validateRow(row);

console.log(result.isValid); // true
console.log(result.errors); // []
```
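The validator takes each row as a `string[]`, so raw CSV text has to be split into fields first. The following is a minimal sketch of that step, assuming double-quoted fields with `""` as the escape for a literal quote. `splitCsvLine` is a hypothetical helper written for illustration and is not part of `@bernierllc/csv-validator`; it does not handle fields that span multiple lines.

```typescript
// Hypothetical helper (not part of the package): split one CSV line into fields.
// Assumes double-quoted fields and "" as an escaped quote inside them.
function splitCsvLine(line: string, delimiter: string = ','): string[] {
  const fields: string[] = [];
  let current = '';
  let inQuotes = false;

  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (inQuotes) {
      if (ch === '"') {
        if (line.charAt(i + 1) === '"') {
          current += '"'; // escaped quote inside a quoted field
          i++;
        } else {
          inQuotes = false; // closing quote
        }
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === delimiter) {
      fields.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  fields.push(current); // last field has no trailing delimiter
  return fields;
}

// The comma inside the quoted name does not split the field.
const parsedRow = splitCsvLine('"Doe, John",john@example.com,30,true');
```

The resulting array has the same shape as the `row` passed to `validateRow` above. For production input, a dedicated CSV parser is usually the safer choice.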
### CSVValidator

Main validator class for single row validation.
#### Constructor
```typescript
new CSVValidator(schema: ValidationSchema, businessRules?: BusinessRule[])
```
#### Methods
##### validateRow(row: string[], lineNumber?: number): ValidationResult
Validates a single row of CSV data.
```typescript
const result = validator.validateRow(['John Doe', 'john@example.com', '30']);
```
##### static validateRow(row: string[], schema: ValidationSchema, businessRules?: BusinessRule[], lineNumber?: number): ValidationResult
Static convenience method for one-off validation.
```typescript
const result = CSVValidator.validateRow(row, schema, businessRules);
```
### BulkCSVValidator

Bulk validator for processing multiple rows with statistics.
#### Constructor
```typescript
new BulkCSVValidator(schema: ValidationSchema, businessRules?: BusinessRule[])
```
#### Methods
##### validateRows(rows: string[][]): BulkValidationResult
Validates multiple rows and returns comprehensive results.
```typescript
const bulkValidator = new BulkCSVValidator(schema);
const result = bulkValidator.validateRows([
  ['John Doe', 'john@example.com', '30'],
  ['Jane Smith', 'jane@example.com', '25'],
  ['Bob Johnson', 'invalid-email', '150']
]);

console.log(result.totalRows); // 3
console.log(result.validRows); // 2
console.log(result.invalidRows); // 1
console.log(result.totalErrors); // 1
```
##### getValidationStats(rows: string[][]): ValidationStats
Get detailed validation statistics.
```typescript
const stats = bulkValidator.getValidationStats(rows);
console.log(stats.errorRate); // 0.33
console.log(stats.averageErrorsPerRow); // 0.33
```
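The two figures in that example are consistent with three rows, one of which is invalid with a single error. The sketch below shows one way such numbers could be derived, assuming `errorRate` means invalid rows divided by total rows and `averageErrorsPerRow` means total errors divided by total rows. Both definitions, and the `computeStats` helper itself, are assumptions for illustration rather than the package's documented behavior.

```typescript
// Hypothetical sketch: derive summary statistics from per-row error counts.
interface SketchStats {
  errorRate: number;           // assumed: invalid rows / total rows
  averageErrorsPerRow: number; // assumed: total errors / total rows
}

function computeStats(errorCountsPerRow: number[]): SketchStats {
  const totalRows = errorCountsPerRow.length;
  if (totalRows === 0) {
    return { errorRate: 0, averageErrorsPerRow: 0 }; // avoid dividing by zero
  }
  const invalidRows = errorCountsPerRow.filter((n) => n > 0).length;
  const totalErrors = errorCountsPerRow.reduce((sum, n) => sum + n, 0);
  return {
    errorRate: invalidRows / totalRows,
    averageErrorsPerRow: totalErrors / totalRows,
  };
}

// Three rows, the last one with a single error: both values are 1/3.
const sketchStats = computeStats([0, 0, 1]);
```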
##### getMostCommonErrors(rows: string[][]): Array<{code: string, count: number, message: string}>
Get the most common validation errors.
```typescript
const commonErrors = bulkValidator.getMostCommonErrors(rows);
// [{ code: 'INVALID_EMAIL', count: 5, message: 'Field must be a valid email address' }]
```
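A result of that shape can be produced by grouping errors by `code` and sorting by frequency. The following sketch illustrates the idea on plain objects; `tallyErrors` and the two interfaces are hypothetical names, and the sample messages are illustrative data rather than output captured from the package.

```typescript
// Hypothetical sketch: count errors per code and sort by descending frequency.
interface SketchError {
  code: string;
  message: string;
}

interface ErrorTally {
  code: string;
  count: number;
  message: string;
}

function tallyErrors(errors: SketchError[]): ErrorTally[] {
  const byCode = new Map<string, ErrorTally>();
  for (const err of errors) {
    const entry = byCode.get(err.code);
    if (entry) {
      entry.count += 1;
    } else {
      // Keep the first message seen for each code.
      byCode.set(err.code, { code: err.code, count: 1, message: err.message });
    }
  }
  return Array.from(byCode.values()).sort((a, b) => b.count - a.count);
}

const tally = tallyErrors([
  { code: 'INVALID_EMAIL', message: 'Field must be a valid email address' },
  { code: 'REQUIRED_FIELD', message: 'Required field is missing or empty' },
  { code: 'INVALID_EMAIL', message: 'Field must be a valid email address' },
]);
// tally[0] describes INVALID_EMAIL with count 2; tally[1] describes REQUIRED_FIELD with count 1.
```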
##### `validateRowsParallel(rows: string[][], batchSize?: number): Promise<BulkValidationResult>`
Validate rows in parallel for large datasets.
```typescript
const result = await bulkValidator.validateRowsParallel(rows, 1000);
```
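To make the role of `batchSize` concrete, here is a minimal sketch of batch-wise processing: rows are split into fixed-size chunks, each chunk is handled by its own promise, and the per-batch counts are combined. This is an illustration of the general pattern, not the package's implementation; `chunk`, `validateInBatches`, and the `isRowValid` callback are hypothetical. Note that promises on a single JavaScript thread interleave work rather than run it on multiple cores.

```typescript
// Hypothetical sketch: split rows into batches and combine per-batch results.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

async function validateInBatches(
  rows: string[][],
  batchSize: number,
  isRowValid: (row: string[]) => boolean
): Promise<{ validRows: number; invalidRows: number }> {
  // One promise per batch; each resolves to the number of valid rows in it.
  const validCounts = await Promise.all(
    chunk(rows, batchSize).map(async (batch) => batch.filter((row) => isRowValid(row)).length)
  );
  const validRows = validCounts.reduce((sum, n) => sum + n, 0);
  return { validRows, invalidRows: rows.length - validRows };
}
```

A caller would pass any row-level predicate as `isRowValid`, for example a wrapper that returns `isValid` from a single-row validation.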
##### static validateRows(rows: string[][], schema: ValidationSchema, businessRules?: BusinessRule[]): BulkValidationResult
Static convenience method for bulk validation.
```typescript
const result = BulkCSVValidator.validateRows(rows, schema, businessRules);
```
### CSVErrorFixer

Automatic error correction for common data format issues.
#### Constructor
```typescript
new CSVErrorFixer(schema: ValidationSchema, businessRules?: BusinessRule[])
```
#### Methods
##### fixRow(row: string[], errors: ValidationError[]): FixedRow
Attempts to fix validation errors in a row.
```typescript
const fixer = new CSVErrorFixer(schema);
const fixedRow = fixer.fixRow(row, errors);

console.log(fixedRow.hasChanges); // true
console.log(fixedRow.confidence); // 0.8
console.log(fixedRow.fixes); // Array of applied fixes
```
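Because each fix carries a confidence score, a caller may want to accept a corrected row only above some threshold and otherwise keep the original for manual review. The sketch below shows that decision on a local mirror of the fields used in this README (`row`, `hasChanges`, `confidence`, `fixes`). `FixedRowLike`, `chooseRow`, and the default threshold of 0.8 are assumptions for illustration, and the element type of `fixes` is left as `unknown` because it is not specified here.

```typescript
// Hypothetical sketch: accept an automatic fix only when confidence is high enough.
interface FixedRowLike {
  row: string[];
  hasChanges: boolean;
  confidence: number;
  fixes: unknown[]; // element shape not specified in this README
}

function chooseRow(
  original: string[],
  fixed: FixedRowLike,
  minConfidence: number = 0.8 // arbitrary illustrative threshold
): string[] {
  return fixed.hasChanges && fixed.confidence >= minConfidence ? fixed.row : original;
}
```

Rows that fall below the threshold can be routed to a review queue instead of being silently rewritten.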
##### static fixRow(row: string[], errors: ValidationError[], schema: ValidationSchema, businessRules?: BusinessRule[]): FixedRow
Static convenience method for error fixing.
```typescript
const fixedRow = CSVErrorFixer.fixRow(row, errors, schema);
```
## Field Types

### String

```typescript
{ name: 'name', type: 'string', required: true, minLength: 2, maxLength: 50 }
```

### Number

```typescript
{ name: 'price', type: 'number', required: true }
{ name: 'amount', type: 'float', required: true }
```

### Integer

```typescript
{ name: 'age', type: 'integer', required: false }
```

### Boolean

```typescript
{ name: 'active', type: 'boolean', required: false }
// Accepts: 'true', 'false', '1', '0', 'yes', 'no'
```

### Date

```typescript
{ name: 'birthDate', type: 'date', required: false }
// Accepts ISO date format: '1990-01-01'
```

### Email

```typescript
{ name: 'email', type: 'email', required: true }
```

### URL

```typescript
{ name: 'website', type: 'url', required: false }
```

### Phone

```typescript
{ name: 'phone', type: 'phone', required: false }
```

### Custom

```typescript
{ name: 'custom', type: 'custom', custom: (value) => value.startsWith('ABC') }
```

## Business Rules
Define custom validation logic that applies to entire rows.
```typescript
const businessRules: BusinessRule[] = [
  {
    name: 'email_domain_check',
    condition: (row) => {
      const email = row[1]; // email field
      // endsWith avoids matching addresses such as 'x@company.com.example.org'
      return email.endsWith('@company.com');
    },
    message: 'Email must be from company.com domain',
    severity: 'warning'
  },
  {
    name: 'age_salary_validation',
    condition: (row) => {
      const age = parseInt(row[2], 10); // age field
      const salary = parseInt(row[3], 10); // salary field
      // Minors pass only when salary is at most 50,000
      return age < 18 ? salary <= 50000 : true;
    },
    message: 'Minors cannot have salary above $50,000',
    severity: 'error'
  }
];
```

## Row Constraints
Define constraints that apply to entire rows.
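In the examples in this README, a `condition` that returns `true` means the row passes, and the `message` describes what went wrong when it returns `false`. The sketch below shows how a list of such rules or constraints might be evaluated against one row under that convention. `evaluateRules`, `RowRule`, and `RuleViolation` are hypothetical names, and treating a condition that throws as a failure is a design choice of this sketch, not documented package behavior.

```typescript
// Hypothetical sketch: evaluate row-level rules; condition === true means "passes".
type Severity = 'error' | 'warning';

interface RowRule {
  name: string;
  condition: (row: string[]) => boolean;
  message: string;
  severity: Severity;
}

interface RuleViolation {
  rule: string;
  message: string;
  severity: Severity;
}

function evaluateRules(row: string[], rules: RowRule[]): RuleViolation[] {
  const violations: RuleViolation[] = [];
  for (const rule of rules) {
    let passed: boolean;
    try {
      passed = rule.condition(row);
    } catch {
      passed = false; // sketch choice: a throwing condition counts as a failure
    }
    if (!passed) {
      violations.push({ rule: rule.name, message: rule.message, severity: rule.severity });
    }
  }
  return violations;
}

const ageRange: RowRule = {
  name: 'age_range',
  condition: (row) => {
    const age = Number.parseInt(row[2] ?? '', 10); // a missing field parses to NaN and fails
    return age >= 0 && age <= 120;
  },
  message: 'Age must be between 0 and 120',
  severity: 'error',
};

// An age of 150 is out of range, so this yields one violation for 'age_range'.
const violations = evaluateRules(['Bob Johnson', 'bob@example.com', '150'], [ageRange]);
```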
```typescript
const schema: ValidationSchema = {
  fields: [/* ... */],
  constraints: [
    {
      name: 'age_range',
      condition: (row) => {
        const age = parseInt(row[2], 10);
        return age >= 0 && age <= 120;
      },
      message: 'Age must be between 0 and 120',
      severity: 'error'
    }
  ]
};
```

## Error Correction
The error fixer can automatically correct common data format issues:
- Numbers: Remove non-numeric characters
- Integers: Remove decimal parts
- Booleans: Convert various formats to standard true/false
- Dates: Convert common formats to ISO format
- Emails: Clean and normalize email addresses
- URLs: Add missing protocols
- Phone Numbers: Remove formatting characters
- Enums: Fix case sensitivity and find partial matches
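Three of the corrections listed above can be sketched as small string normalizers. These functions are illustrations of the general technique and are not the package's implementation: the function names are hypothetical, the accepted boolean spellings follow the list given for boolean fields in this README, and defaulting to `https://` for a missing protocol is an assumption.

```typescript
// Hypothetical sketches of three normalizers; not the package's implementation.
const TRUE_WORDS = new Set(['true', '1', 'yes']);
const FALSE_WORDS = new Set(['false', '0', 'no']);

// Booleans: map accepted spellings to 'true' / 'false'; null when unrecognized.
function normalizeBoolean(value: string): string | null {
  const v = value.trim().toLowerCase();
  if (TRUE_WORDS.has(v)) return 'true';
  if (FALSE_WORDS.has(v)) return 'false';
  return null;
}

// URLs: prepend a protocol when none is present (https is an assumed default).
function addMissingProtocol(value: string): string {
  const v = value.trim();
  return /^[a-z][a-z0-9+.-]*:\/\//i.test(v) ? v : `https://${v}`;
}

// Phone numbers: drop formatting characters, keeping a leading '+' if present.
function stripPhoneFormatting(value: string): string {
  const v = value.trim();
  const digits = v.replace(/\D/g, '');
  return v.startsWith('+') ? `+${digits}` : digits;
}
```

Returning `null` for an unrecognized boolean keeps the decision with the caller, which fits the idea of reporting a fix together with a confidence score rather than guessing.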
```typescript
const fixer = new CSVErrorFixer(schema);
const errors = validator.validateRow(row).errors;
const fixedRow = fixer.fixRow(row, errors);

if (fixedRow.hasChanges) {
  console.log('Applied fixes:', fixedRow.fixes);
  console.log('Confidence:', fixedRow.confidence);
}
```

## Validation Results
### ValidationResult

```typescript
interface ValidationResult {
  isValid: boolean;
  errors: ValidationError[];
  suggestions: Suggestion[];
  warnings: ValidationWarning[];
  totalErrors: number;
  totalWarnings: number;
  totalSuggestions: number;
}
```

### ValidationError

```typescript
interface ValidationError {
  field: string;
  index: number;
  value: string;
  message: string;
  code: string;
  severity: 'error' | 'warning';
  suggestion?: string;
  lineNumber?: number;
  columnNumber?: number;
}
```

### Suggestion

```typescript
interface Suggestion {
  field: string;
  index: number;
  originalValue: string;
  suggestedValue: string;
  confidence: number;
  reason: string;
}
```

## Examples
### Basic Validation

```typescript
import { CSVValidator, ValidationSchema } from '@bernierllc/csv-validator';

const schema: ValidationSchema = {
  fields: [
    { name: 'name', type: 'string', required: true },
    { name: 'email', type: 'email', required: true },
    { name: 'age', type: 'integer', required: false }
  ]
};

const validator = new CSVValidator(schema);
const result = validator.validateRow(['John Doe', 'john@example.com', '30']);

if (!result.isValid) {
  console.log('Validation errors:', result.errors);
  console.log('Suggestions:', result.suggestions);
}
```

### Bulk Validation

```typescript
import { BulkCSVValidator } from '@bernierllc/csv-validator';

const bulkValidator = new BulkCSVValidator(schema);
const result = bulkValidator.validateRows(rows);

console.log(`Valid rows: ${result.validRows}/${result.totalRows}`);
console.log(`Error rate: ${result.summary.errorRate}`);
console.log('Most common errors:', result.summary.mostCommonErrors);
```

### Error Correction

```typescript
import { CSVErrorFixer, CSVValidator } from '@bernierllc/csv-validator';

const fixer = new CSVErrorFixer(schema);
const validator = new CSVValidator(schema);

const row = ['John Doe', 'john at example.com', 'abc'];
const validationResult = validator.validateRow(row);

if (!validationResult.isValid) {
  const fixedRow = fixer.fixRow(row, validationResult.errors);
  if (fixedRow.hasChanges) {
    console.log('Fixed row:', fixedRow.row);
    console.log('Applied fixes:', fixedRow.fixes);
  }
}
```

### Custom Business Rules

```typescript
const businessRules: BusinessRule[] = [
  {
    name: 'senior_discount',
    condition: (row) => {
      const age = parseInt(row[2], 10);
      const discount = parseFloat(row[4]);
      // Rows for customers aged 65+ pass only when the discount is at most 25%
      return age >= 65 ? discount <= 0.25 : true;
    },
    message: 'Senior discount cannot exceed 25%',
    severity: 'error'
  }
];

const validator = new CSVValidator(schema, businessRules);
```

## Performance
- Single Row Validation: ~0.1ms per row
- Bulk Validation: ~1000 rows/second
- Parallel Processing: ~5000 rows/second (with batching)
- Memory Usage: ~1MB per 10,000 rows
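Throughput depends on hardware, schema complexity, and row width, so it can be worth measuring on representative data. The sketch below is a minimal timing harness, assuming any row-level validation function is passed in as `validate`; `measureRowsPerSecond` and the generated sample rows are hypothetical and not part of the package.

```typescript
// Hypothetical sketch: measure rows processed per second for a given validate function.
function measureRowsPerSecond(
  rows: string[][],
  validate: (row: string[]) => unknown
): number {
  const start = Date.now();
  for (const row of rows) {
    validate(row);
  }
  // Clamp to 1 ms so very fast runs do not divide by zero.
  const elapsedMs = Math.max(Date.now() - start, 1);
  return (rows.length / elapsedMs) * 1000;
}

// Synthetic sample data: name, email, age.
const sampleRows: string[][] = Array.from({ length: 10000 }, (_, i) => [
  `User ${i}`,
  `user${i}@example.com`,
  String(20 + (i % 50)),
]);

// Stand-in check used only to exercise the harness.
const rowsPerSecond = measureRowsPerSecond(sampleRows, (row) => row.length === 3);
```

Millisecond resolution is coarse, so larger inputs give more stable numbers than small ones.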
## Error Codes

| Code | Description |
|------|-------------|
| `REQUIRED_FIELD` | Required field is missing or empty |
| `INVALID_NUMBER` | Field is not a valid number |
| `INVALID_INTEGER` | Field is not a valid integer |
| `INVALID_BOOLEAN` | Field is not a valid boolean |
| `INVALID_DATE` | Field is not a valid date |
| `INVALID_EMAIL` | Field is not a valid email address |
| `INVALID_URL` | Field is not a valid URL |
| `INVALID_PHONE` | Field is not a valid phone number |
| `MIN_LENGTH` | Field is shorter than minimum length |
| `MAX_LENGTH` | Field is longer than maximum length |
| `PATTERN_MISMATCH` | Field does not match required pattern |
| `INVALID_ENUM` | Field value is not in allowed enum values |
| `BUSINESS_RULE_VIOLATION` | Business rule validation failed |
| `ROW_CONSTRAINT_VIOLATION` | Row constraint validation failed |
| `CUSTOM_VALIDATION` | Custom validation function failed |

Bernier LLC - All rights reserved.
This package is licensed to the client under a limited-use license.
The client may use and modify this code only within the scope of the project it was delivered for.
Redistribution or use in other products or commercial offerings is not permitted without written consent from Bernier LLC.