Calculate complexity scores for SQL queries based on a weighting system for query components.
SQompLexity is a metric that assigns a complexity score to SQL queries. It is specifically tailored to work with
MySQL queries, but other dialects of SQL will likely work as well. It needs no knowledge of the database schema and
quantifies each query in a vacuum.

## Installation
```shell
npm i sqomplexity
```

## Demo
https://bert-w.github.io/sqomplexity/

## Usage instructions

### Node
```js
import { Sqomplexity } from 'sqomplexity';

(async () => {
    const sqomplexity = new Sqomplexity([
        "SELECT * FROM users",
    ]);

    console.log(
        await sqomplexity.score()
    );
    // Result: [ 2.40625 ]
})();
```
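Multiple queries can be passed in a single call; judging from the array result above, `score()` presumably returns one score per query, in input order (a sketch, not verified against the library):

```js
import { Sqomplexity } from 'sqomplexity';

(async () => {
    // Presumed behavior (based on the array result shown above):
    // one complexity score per input query, in the same order.
    const scores = await new Sqomplexity([
        "SELECT * FROM users",
        "SELECT id FROM orders WHERE total > 100",
    ]).score();

    console.log(scores);
})();
```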
See examples/node.js for a full example.

### Browser
Use the precompiled dist/sqomplexity.umd.js file.
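A minimal sketch of loading it in a page; it is assumed here (not verified) that the UMD bundle registers a global `Sqomplexity` constructor:

```html
<!-- Load the precompiled UMD bundle; adjust the path as needed. -->
<script src="dist/sqomplexity.umd.js"></script>
<script>
    // Assumption: the UMD build exposes the `Sqomplexity` class globally.
    (async () => {
        const sqomplexity = new Sqomplexity(['SELECT * FROM users']);
        console.log(await sqomplexity.score());
    })();
</script>
```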
See examples/browser.html for a full example.

### Command line
Use the precompiled dist/sqomplexity.js, which contains all required code in a single file.

Options:
```shell
node sqomplexity.js --help

Arguments:
  queries                  one or multiple SQL queries (space separated or quoted)

Options:
  -V, --version            output the version number
  -f, --files              assumes the given arguments/queries are filepaths, and it will read the contents from them.
                           Every file is expected to contain 1 query; if not, their complexity is summed
  -b, --base64             assumes the given arguments/queries are base64 encoded
  -s, --score              output only the complexity score. -1 will be returned if an error occurs
  -w, --weights            takes a path to a json file that defines a custom set of weights
  -a, --all                returns all data including the AST
  -p, --pretty-print       output JSON with indentation and newlines (default: false)
  -h, --help               display help for command
```
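For example, using the flags documented above (the query is illustrative):

```shell
# Print only the numeric complexity score for one query:
node sqomplexity.js "SELECT * FROM users" --score

# The same query passed base64-encoded:
node sqomplexity.js "U0VMRUNUICogRlJPTSB1c2Vycw==" --base64 --score
```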
The scoring of an SQL query is based on two major components:

- _Data complexity_ (see prefix D in the table below), also called _computational complexity_, which takes into account elements like the _amount of rows_ that a query operates on (relatively speaking), the _computation paths_ a query may take, and the usage of _table indexes_ (_indices_). All of these determine the computational cost of a given component.
- _Cognitive complexity_ (see prefix C in the table below), which describes the mental effort and the concepts a person must understand in order to parse the query. This includes understanding of _first-order logic_, common SQL concepts such as _grouping_, _filtering_ and _sorting_, and _domain knowledge_ like the context of the query compared to its database schema (see the example below).
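For intuition (illustrative only; actual scores depend on the weights described below): a trivial `SELECT * FROM users` triggers few of these indicators, whereas the following query combines data-complexity indicators (a join and an aggregate over many rows) with cognitive ones (grouping, filtering and sorting):

```sql
-- Illustrative query touching several indicators at once:
-- a join and an aggregate (data complexity), plus grouping,
-- filtering and sorting (cognitive complexity).
SELECT u.country, COUNT(*) AS order_count
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE o.total > 100
GROUP BY u.country
ORDER BY order_count DESC;
```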
What follows is the assignment of each of these indicators to components of an SQL query; the table below shows the
result of this process. The presence and combination of these indicators determine a final weighting for each
component: _Low_, _Medium_ or _High_.
The contributions and numerical weights for these groups are as follows:

| Category             | Contribution |
|----------------------|--------------|
| Data Complexity      | 50%          |
| Cognitive Complexity | 50%          |

| Weighting | Numerical Score |
|-----------|-----------------|
| Low       | 1.0             |
| Medium    | 1.25            |
| High      | 1.5             |
The equal contribution of _Data Complexity_ and _Cognitive Complexity_ is arbitrary, and further research could yield
a distribution that more fairly approximates a general sense of _complexity_.
Similarly, the weights for _Low_, _Medium_ and _High_ are set to sensible defaults. All weights must be greater than
or equal to 1, however: the algorithm may multiply weights together, and a weight below 1 would then reduce a
component's score instead of increasing it.