Reads a file to determine if it's genuine COPC, and provides details about any errors/warnings against the COPC and LAS specs
npm install copc-validator

1. Introduction
   1. Getting Started
2. Usage
   1. CLI
      1. Options
   2. Import
      1. Options
3. Scans
   1. Quick scan
   2. Full scan
   3. Output
4. Details
   1. Checks
      1. Status & Check Objects
      2. Functions
      3. Suites
      4. Parsers
      5. Collections
      6. All checks
   2. Report
      1. Report schema
5. Future Plans
_COPC Validator_ is a library & command-line application for validating the header and content of a Cloud-Optimized Point Cloud (COPC) LAS file. Extending the copc.js library, it accepts either a (relative) file path or a COPC URL, and runs a series of checks against the values parsed by copc.js.
1. Install from npm
npm i -g copc-validator
_Global install is recommended for CLI usage_
2. Scan copc.laz file with copcc CLI
Examples:
- Default
copcc ./path/to/example.copc.laz
- Deep scan, output written to the path given by --output
copcc --deep path/to/example.copc.laz --output=output.json
- Deep & Minified scan with worker count = 64, showing a progress bar
copcc path/to/example.copc.laz -dmpw 64
_COPC Validator_ has two main usages: via the copcc Command-Line Interface (CLI), or imported as the generateReport() function
copcc [options]
_The usage and implementation of COPC Validator are meant to be as simple as possible. The CLI needs only one file path and runs a shallow scan by default, or a deep scan if given the --deep option. All other functionality is completely optional._
| Option | Alias | Description | Type | Default |
| :--------- | :---: | ---------------------------------------------------------------------------------- | :-------: | :-------: |
| deep | d | Read all points of each node; Otherwise, read only root point | boolean | false |
| name | n | Replace name in Report with provided string | string | |
| mini | m | Omit Copc or Las from Report, leaving checks and scan info | boolean | false |
| pdal | P | Output a pdal.metadata object containing header & vlr data in pdal info format | boolean | false |
| workers | w | Number of Workers to create - _Use at own (performance) risk_ | number | CPU-count |
| queue | q | Queue size limit for reading PDR data. Useful for very high node counts (>10000) | number | Unlimited |
| sample | s | Select a random sample of nodes to read & validate | number | All nodes |
| progress | p | Show a progress bar while reading the point data | boolean | false |
| output | o | Writes the Report out to provided filepath; Otherwise, writes to stdout | string | N/A |
| help | h | Displays help information for the copcc command; Overrides all other options | boolean | N/A |
| version | v | Displays copc-validator version (from package.json) | boolean | N/A |
1. Add to project:

   ```sh
   yarn add copc-validator
   # or
   npm i copc-validator
   ```

2. Import generateReport():

   ```TypeScript
   import { generateReport } from 'copc-validator'
   ```

   - Example:

   ```TypeScript
   async function printReport() {
     const report = await generateReport({
       source: 'path/to/example.copc.laz',
       options: {} // default options
     })
     console.log(report)
   }
   ```

3. Copy laz-perf.wasm to /public _(for browser usage)_
generateReport accepts most\* of the same options as the CLI through the options property of the first parameter:

TypeScript:

```TypeScript
const generateReport = ({
  source: string | File,
  options?: {
    name?: string          // default: source | 'COPC Validator Report'
    mini?: boolean         // default: false
    pdal?: boolean         // default: false
    deep?: boolean         // default: false
    workers?: number       // default: CPU Thread Count
    queueLimit?: number    // default: Infinity
    sampleSize?: number    // default: All nodes
    showProgress?: boolean // default: false
  },
  collections?: {copc, las, fallback}
})
```

_See below for collections information_
> \* Key option differences:
>
> - No output, help, or version options
> - queue is renamed to queueLimit
> - sample is renamed to sampleSize
> - progress is renamed to showProgress
>   - Not usable in a browser
> - Any Alias (listed above) will not work
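
The renames listed above can be captured in a small mapping helper. This is only a sketch: `RENAMED_OPTIONS`, `CLI_ONLY_OPTIONS`, and `toReportOptions` are hypothetical names that copc-validator does not export; only the option names themselves come from the tables in this README.

```TypeScript
// Hypothetical helper; none of these names are exported by copc-validator.
// CLI option names that generateReport() expects under a different name.
const RENAMED_OPTIONS: Record<string, string> = {
  queue: 'queueLimit',
  sample: 'sampleSize',
  progress: 'showProgress',
}

// CLI options that have no generateReport() equivalent.
const CLI_ONLY_OPTIONS = ['output', 'help', 'version']

// Translate a CLI-style options object into the shape documented above.
function toReportOptions(cli: Record<string, unknown>): Record<string, unknown> {
  const mapped: Record<string, unknown> = {}
  for (const key of Object.keys(cli)) {
    if (CLI_ONLY_OPTIONS.indexOf(key) !== -1) continue // drop CLI-only options
    mapped[RENAMED_OPTIONS[key] ?? key] = cli[key] // rename where needed
  }
  return mapped
}

const reportOptions = toReportOptions({ deep: true, queue: 1000, progress: true, output: 'report.json' })
console.log(reportOptions) // { deep: true, queueLimit: 1000, showProgress: true }
```

The resulting object could then be passed as the `options` property of `generateReport()`.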
_COPC Validator_ comes with two scan types, shallow and deep
_(see requirements.md for more details)_
> The report output supports a custom scan type, intended to be used by other developers that may extend the base functionality of _COPC Validator_. It is not _currently_ used anywhere in this library.
The shallow scan checks the LAS Public Header Block and various Variable Length Records (VLRs) to ensure the values adhere to the COPC specifications.
This scan also checks the root (first) point of every node (in the COPC Hierarchy) to ensure those points are valid according to the contents of the LAS Header and COPC Info VLR.
The deep scan performs the same checks as a shallow scan, but scans every point of each node rather than just the root point, in order to validate the full contents of the Point Data Records (PDRs) against the COPC specs and Header info.
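
The difference between the two scan types comes down to which points of a node get read. A minimal sketch, assuming a node is described only by its point count (`ScanDepth` and `pointIndicesToRead` are hypothetical names, not part of the library):

```TypeScript
// Hypothetical sketch: which point indices of a node each scan type reads.
type ScanDepth = 'shallow' | 'deep'

function pointIndicesToRead(pointCount: number, depth: ScanDepth): number[] {
  if (pointCount <= 0) return [] // empty node: nothing to read
  if (depth === 'shallow') return [0] // root (first) point only
  const indices: number[] = []
  for (let i = 0; i < pointCount; i++) indices.push(i) // deep: every point
  return indices
}
```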
_COPC Validator_ outputs a JSON report according to the Report Schema, intended to be translated into a more human-readable format (such as a PDF or Webpage summary)
A Check ultimately refers to the Object created by calling a Check.Function with performCheck(), which uses the Check.Suite property name to build the returned Check.Status into a complete Check.Check. This already feels like a bit much, without even mentioning Check.Parsers or Check.Collections, so we'll break it down piece-by-piece here
Pseudo-TypeScript:
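```TypeScript
namespace Check {
  type Status = {
    status: 'pass' | 'fail' | 'warn'
    description: string
    info?: string
  }
  type Check = Status & { id: string }

  type Function<T> =
    | ((c: T) => Status)
    | ((c: T) => Promise<Status>)

  // Shapes below follow the descriptions in this section; field names are
  // assumptions, see ./src/types/check.ts for the authoritative definitions
  type Suite<T> = { [id: string]: Function<T> }
  type SuiteWithSource<T> = { source: T; suite: Suite<T> }
  type Parser<S, P> = (source: S) => Promise<SuiteWithSource<P>>

  type Collection = (SuiteWithSource<any> | Promise<SuiteWithSource<any>>)[]
}
type Check = Check.Check
```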
_See ./src/types/check.ts for the actual TypeScript code_
A Check.Status Object contains a status property with a value of "pass", "fail", or "warn", and optionally contains an info property with a string value.
A Check Object is the same as a Status Object with an additional string property named id
- _pass means file definitely matches COPC specifications_
- _fail means file does not match any COPC specifications_
- _warn means file may not match current COPC specifications or recommendations_
Check.Functions maintain the following properties:
- Single (Object) parameter
- Synchronous or Asynchronous
- Output: Check.Status _(or a Promise)_
- Pure function
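
As an illustration of those properties, here is a minimal sketch of a Check.Function for the `minorVersion` check. The `FnStatus` and `HeaderSlice` types and both function names are local, hypothetical stand-ins; the library's own implementation lives under src/suites and may differ.

```TypeScript
// Local copy of the Status shape from the pseudo-type above.
type FnStatus = {
  status: 'pass' | 'fail' | 'warn'
  description: string
  info?: string
}

// Hypothetical, minimal slice of a LAS header; the real object has many more fields.
type HeaderSlice = { minorVersion: number }

// Single object parameter, no side effects, returns a Status.
function minorVersionCheck(header: HeaderSlice): FnStatus {
  const description = 'header.minorVersion is 4'
  return header.minorVersion === 4
    ? { status: 'pass', description }
    : { status: 'fail', description, info: `found minorVersion ${header.minorVersion}` }
}

// An asynchronous Check.Function has the same shape but resolves to a Status.
async function minorVersionCheckAsync(header: HeaderSlice): Promise<FnStatus> {
  return minorVersionCheck(header)
}
```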
A Check.Suite is a map of string ids to Check.Functions, where each Function uses the same Object as its parameter (such as the Copc Object, for ./src/suites/copc.ts). The id of a Function becomes the id value of the Check Object when a Check.Suite invokes its Functions
_The purpose of this type of grouping is to limit the number of Getter calls for the same section of a file, like the 375 byte Header_
All Suites (with their Check.Functions) are located under src/suites
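
The sketch below shows how a Suite might be invoked so that each Function's id becomes the id of the resulting Check. `invokeSuiteSketch` is a hypothetical stand-in for the library's own invocation step (the README mentions `performCheck()`, whose signature is not shown here), and `LasHeaderSlice` is an invented, reduced source object.

```TypeScript
type SuiteStatus = { status: 'pass' | 'fail' | 'warn'; description: string; info?: string }
type SuiteCheck = SuiteStatus & { id: string }
type SuiteFunction<T> = (c: T) => SuiteStatus | Promise<SuiteStatus>
type SuiteMap<T> = { [id: string]: SuiteFunction<T> }

// Hypothetical source object: only the header fields these checks need.
type LasHeaderSlice = { headerLength: number; pointDataRecordFormat: number }

// A Suite: string ids mapped to Functions that all take the same source object.
const headerSuiteSketch: SuiteMap<LasHeaderSlice> = {
  headerLength: (h) => ({
    status: h.headerLength === 375 ? 'pass' : 'fail',
    description: 'headerLength is 375',
  }),
  pointDataRecordFormat: (h) => ({
    status: [6, 7, 8].indexOf(h.pointDataRecordFormat) !== -1 ? 'pass' : 'fail',
    description: 'pointDataRecordFormat is 6, 7, or 8',
  }),
}

// Run every Function against one source object; each id becomes the Check's id.
async function invokeSuiteSketch<T>(source: T, suite: SuiteMap<T>): Promise<SuiteCheck[]> {
  const ids = Object.keys(suite)
  return Promise.all(
    ids.map(async (id) => ({ id, ...(await suite[id](source)) })),
  )
}
```

Because the source object is parsed once and shared, adding a check to the Suite does not add another read of the file.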
Check.Parsers are functions that take a source Object and return a Check.SuiteWithSource Object. Their main purpose is to parse a section of the given file into a usable object, and then return that object with its corresponding Suite to be invoked from within a Collection.
All Parsers are located under src/parsers (ex: nodeParser)
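
A Parser can be sketched as follows. Everything here is illustrative: `ByteGetter` is modeled on the `(begin, end) => Promise<Uint8Array>` getter style used by copc.js (treat the exact shape as an assumption), and `headerParserSketch`, `ParsedHeader`, and `HeaderBundle` are hypothetical names. The byte offsets are those of the LAS Public Header Block.

```TypeScript
// Assumed getter shape: fetch bytes [begin, end) of the file.
type ByteGetter = (begin: number, end: number) => Promise<Uint8Array>

type ParserStatus = { status: 'pass' | 'fail' | 'warn'; description: string; info?: string }
type ParsedHeader = { minorVersion: number; headerLength: number }
type HeaderBundle = {
  source: ParsedHeader
  suite: { [id: string]: (h: ParsedHeader) => ParserStatus }
}

const parsedHeaderSuite: HeaderBundle['suite'] = {
  minorVersion: (h) => ({ status: h.minorVersion === 4 ? 'pass' : 'fail', description: 'minorVersion is 4' }),
  headerLength: (h) => ({ status: h.headerLength === 375 ? 'pass' : 'fail', description: 'headerLength is 375' }),
}

// One Getter call for the 375-byte header; every check then reuses the parsed object.
async function headerParserSketch(get: ByteGetter): Promise<HeaderBundle> {
  const bytes = await get(0, 375)
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength)
  const source: ParsedHeader = {
    minorVersion: view.getUint8(25), // LAS "Version Minor" at byte offset 25
    headerLength: view.getUint16(94, true), // LAS "Header Size" at byte offset 94, little-endian
  }
  return { source, suite: parsedHeaderSuite }
}
```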
#### nodes.ts
src/parsers/nodes.ts is unique among Parsers, in that it's actually running a Suite repeatedly as it parses. However, the data is not returned from the multithreaded Workers like a regular Check.Suite, so nodes.ts then gives the output data to the (_new_) pointDataSuite for sorting into Check.Statuses
#### worker.js
src/utils/worker.js essentially matches the structure of a Suite because it used to be the src/suites/point-data.ts Suite. To increase speed, the pointDataSuite became per-Node instead of per-File, which maximizes multi-threading, but creates quite a mess since worker.js must be (nearly) entirely self-contained for Worker/Web Worker threading. So src/suites/point-data.ts now parses the output of src/utils/worker.js, all of which is controlled by the src/parsers/nodes.ts Parser
Check.Collections are arrays of Check.Suites with their respective source Object (Check.SuiteWithSource above). They allow Promises in order to use Check.Parsers internally without having to await them.
All Collections are located under src/collections (ex: CopcCollection)
Replacing Collections is the primary way of generating custom reports through generateReport, as you can supply different Check.Suites to perform different Check.Functions per source object.
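
To make the Promise handling concrete, here is a sketch of a custom Collection and a runner that resolves it. `runCollectionSketch`, `CollBundle`, and the two example checks are hypothetical; the `source`/`suite` field names are assumptions about Check.SuiteWithSource rather than confirmed API.

```TypeScript
type CollStatus = { status: 'pass' | 'fail' | 'warn'; description: string; info?: string }
type CollCheck = CollStatus & { id: string }
type CollBundle<T> = {
  source: T
  suite: { [id: string]: (c: T) => CollStatus | Promise<CollStatus> }
}
// A Collection may mix ready bundles with Promises of bundles (e.g. Parser output).
type CollectionSketch = (CollBundle<any> | Promise<CollBundle<any>>)[]

// Resolve every bundle, invoke its Suite, and flatten the resulting Checks.
async function runCollectionSketch(collection: CollectionSketch): Promise<CollCheck[]> {
  const bundles = await Promise.all(collection)
  const perSuite = await Promise.all(
    bundles.map(({ source, suite }) =>
      Promise.all(
        Object.keys(suite).map(async (id) => ({ id, ...(await suite[id](source)) })),
      ),
    ),
  )
  return perSuite.reduce<CollCheck[]>((all, checks) => all.concat(checks), [])
}

// Two invented checks, each with its own source type.
const hasPoints = (s: { pointCount: number }): CollStatus => ({
  status: s.pointCount > 0 ? 'pass' : 'warn',
  description: 'pointCount is greater than 0',
})
const hasWkt = (wkt: string): CollStatus => ({
  status: wkt.length > 0 ? 'pass' : 'warn',
  description: 'wkt string is not empty',
})

// A replacement Collection: one plain bundle and one that arrives as a Promise.
const customCollection: CollectionSketch = [
  { source: { pointCount: 10 }, suite: { hasPoints } },
  Promise.resolve({ source: '', suite: { hasWkt } }),
]
```

Accepting Promises in the array means a Parser's result can be placed in the Collection directly, and all of them are awaited together in one place.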
#### Custom scan
generateReport has functionality to build customized reports by overwriting the Check.Collections used within:
Pseudo-Type:
```TypeScript
import type {Copc, Getter, Las} from 'copc'
type Collections = {
  copc: ({
    filepath: string,
    copc: Copc,
    get: Getter,
    deep: boolean,
    workerCount?: number
  }) => Promise<Check.Collection>
  las: ({
    get: Getter,
    header: Las.Header,
    vlrs: Las.Vlr[]
  }) => Promise<Check.Collection>
  fallback: (get: Getter) => Promise<Check.Collection>
}
const generateReport = async ({
  source: string | File,
  options?: {...},
  collections?: Collections
}) => Promise<Report>
```
| ID | Description | Scan | Suite |
| :------------------------- | ----------------------------------------------------------------------- | :-----: | -------------- |
| minorVersion | copc.header.minorVersion is 4 | Shallow | Header |
| pointDataRecordFormat | copc.header.pointDataRecordFormat is 6, 7, or 8 | Shallow | Header |
| headerLength | copc.header.headerLength is 375 | Shallow | Header |
| pointCountByReturn | Sum of copc.header.pointCountByReturn equals copc.header.pointCount | Shallow | Header |
| legacyPointCount | header.legacyPointCount follows COPC/LAS specs | Shallow | manualHeader |
| legacyPointCountByReturn | header.legacyPointCountByReturn follows COPC/LAS specs | Shallow | manualHeader |
| vlrCount | Number of VLRs in copc.vlrs matches copc.header.vlrCount | Shallow | Vlr |
| evlrCount | Number of EVLRs in copc.vlrs matches copc.header.evlrCount | Shallow | Vlr |
| copc-info | Exactly 1 copc info VLR exists with size of 160 | Shallow | Vlr |
| copc-hierarchy | Exactly 1 copc hierarchy VLR exists | Shallow | Vlr |
| laszip-encoded | Checks for existence of LasZIP compression VLR, warns if not found | Shallow | Vlr |
| wkt | Ensures wkt string can initialize proj4 | Shallow | manualVlr |
| bounds within cube | Copc cube envelops Las bounds (min & max) | Shallow | Copc |
| rgb | RGB channels are used in PDR, if present | Shallow | PointData |
| rgbi | Checks for 16-bit scaling of RGBI values, warns if 8-bit | Shallow | PointData |
| xyz | Each point exists within Las and Copc bounds, per node | Shallow | PointData |
| gpsTime | Each point has GpsTime value within Las bounds | Shallow | PointData |
| sortedGpsTime | The points in each node are sorted by GpsTime value, warns if not | Deep | PointData |
| returnNumber | Each point has ReturnNumber <= NumberOfReturns | Shallow | PointData |
| zeroPoint | Warns with list of all pointCount: 0 nodes in the Hierarchy | Deep\* | PointData |
| nodesReachable | Every Node ('D-X-Y-Z') in the Hierarchy is reachable | Shallow | PointData |
| pointsReachable | Each Node pageOffset + pageLength leads into another Node page | Shallow | PointData |
| ...ID | ...Description | Shallow | ... |
Checks and their IDs are subject to change as I see fit
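
Two of the checks in the table are simple enough to sketch directly. These are illustrative re-implementations based only on the table's descriptions, with invented parameter shapes and function names; the library's actual code and messages may differ.

```TypeScript
type TableStatus = { status: 'pass' | 'fail' | 'warn'; description: string; info?: string }

// pointCountByReturn: the per-return counts must add up to the total point count.
function pointCountByReturnCheck(header: { pointCount: number; pointCountByReturn: number[] }): TableStatus {
  const sum = header.pointCountByReturn.reduce((total, n) => total + n, 0)
  const description = 'Sum of pointCountByReturn equals pointCount'
  return sum === header.pointCount
    ? { status: 'pass', description }
    : { status: 'fail', description, info: `sum is ${sum}, pointCount is ${header.pointCount}` }
}

// returnNumber: every point must satisfy ReturnNumber <= NumberOfReturns.
function returnNumberCheck(points: { returnNumber: number; numberOfReturns: number }[]): TableStatus {
  const invalid = points.filter((p) => p.returnNumber > p.numberOfReturns).length
  const description = 'Each point has ReturnNumber <= NumberOfReturns'
  return invalid === 0
    ? { status: 'pass', description }
    : { status: 'fail', description, info: `${invalid} point(s) with ReturnNumber > NumberOfReturns` }
}
```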
See JSON Schema
TypeScript pseudo-type Report:
```TypeScript
import * as Copc from 'copc'
type Report = {
  name: string
  scan: {
    type: 'shallow' | 'deep' | 'custom' | string //| 'shallow-X/N' | 'deep-X/N'
    filetype: 'COPC' | 'LAS' | 'Unknown'
    start: Date
    end: Date
    time: number
  }
  checks: ({
    id: string
    status: 'pass' | 'fail' | 'warn'
    info?: string
  })[]

  // When scan.filetype === 'COPC'
  copc?: {
    header: Copc.Las.Header
    vlrs: Copc.Las.Vlr[]
    info: Copc.Info
    wkt: string
    eb: Copc.Las.ExtraBytes
  }

  // When scan.filetype === 'LAS'
  las?: {
    header: Copc.Las.Header
    vlrs: Copc.Las.Vlr[]
  }
  error: {
    message: string
    stack?: string
  }

  // When scan.filetype === 'Unknown'
  error: {
    message: string
    stack?: string
  }
  copcError?: {
    message: string
    stack?: string
  } // only used if Copc.create() and Las.*.parse() fail for different reasons
}
```
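
Since the JSON report is meant to be translated into something more readable, a consumer might start with a summary like the sketch below. `summarizeReport` and the reduced `ReportSlice` type are hypothetical and only rely on fields shown in the pseudo-type above.

```TypeScript
// Only the fields this sketch needs from the Report pseudo-type.
type ReportCheckEntry = { id: string; status: 'pass' | 'fail' | 'warn'; info?: string }
type ReportSlice = {
  name: string
  scan: { type: string; filetype: string }
  checks: ReportCheckEntry[]
}

// Turn a Report into a few human-readable lines: a header, counts, then non-passing checks.
function summarizeReport(report: ReportSlice): string[] {
  const counts = { pass: 0, fail: 0, warn: 0 }
  for (const check of report.checks) counts[check.status] += 1
  const lines = [
    `${report.name} (${report.scan.filetype}, ${report.scan.type} scan)`,
    `pass: ${counts.pass}, warn: ${counts.warn}, fail: ${counts.fail}`,
  ]
  for (const check of report.checks) {
    if (check.status === 'pass') continue
    lines.push(`${check.status.toUpperCase()} ${check.id}${check.info ? ': ' + check.info : ''}`)
  }
  return lines
}
```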
- Add more Check.Functions - waiting on laz-perf chunk table
- Rewrite LAS Check.Collection to validate LAS 1.4 specifications
- Continue to optimize for speed, especially large (1.5GB+) files