RQL (Ruhis Query Language) is a powerful library designed to simplify the process of filtering, sorting, and aggregating large amounts of data. With RQL, you can effortlessly extract valuable insights from complex datasets, making data analysis and manipulation tasks more efficient. RQL was initially developed for an internal SIEM project, so it is well suited for security-related use cases, but it can be used for any type of data.
- Simple and intuitive syntax - RQL is designed to be easy to learn and use. The syntax is similar to KQL or XQL, but with a few key differences that make it more intuitive and powerful.
- Light, type-safe and developer friendly - RQL is written in TypeScript and compiled to JavaScript. It is lightweight and well documented, making it easy to integrate into any project.
- Thoroughly tested - RQL has a comprehensive test suite with very high code coverage, which helps ensure that it works as expected.
Quick Start Guide
1. Install via your preferred package manager:
- npm install @ruhisfi/rql
- yarn add @ruhisfi/rql

2. Import QueryParser and QueryExecutor into your code:
```js
import { QueryParser, QueryExecutor } from "@ruhisfi/rql";
```
3. Parse a query and execute it against a dataset:
```js
const query =
  'dataset = example_data | filter name = "John" or country = "Finland" | fields name, country, city, email, age | sort age desc | limit 10';

// Validate the query and convert it into a JS object
const parsedQuery = QueryParser.parseQuery(query);

// Execute the parsed query against the dataset (`data` is your own dataset, not defined in this snippet)
const result = QueryExecutor.executeQuery(parsedQuery, data);
```
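To make the pipeline in step 3 more concrete, the sketch below reproduces roughly what that query asks for using plain array methods. It does not use RQL and is not how RQL is implemented; the sample records, the `exampleData` and `manualResult` names, and the assumption that a dataset is an array of plain objects are all hypothetical and for illustration only.

```js
// Hypothetical sample records; the real dataset shape is an assumption.
const exampleData = [
  { id: 1, name: "John", country: "USA", city: "Boston", email: "john@example.com", age: 41 },
  { id: 2, name: "Aino", country: "Finland", city: "Helsinki", email: "aino@example.com", age: 29 },
  { id: 3, name: "Maria", country: "Spain", city: "Madrid", email: "maria@example.com", age: 35 },
];

const fields = ["name", "country", "city", "email", "age"];

const manualResult = exampleData
  // filter name = "John" or country = "Finland"
  .filter((row) => row.name === "John" || row.country === "Finland")
  // fields name, country, city, email, age
  .map((row) => Object.fromEntries(fields.map((f) => [f, row[f]])))
  // sort age desc
  .sort((a, b) => b.age - a.age)
  // limit 10
  .slice(0, 10);

console.log(manualResult);
```

Each chained call corresponds to one pipe-separated statement, applied in the order written.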
Syntax Guide
A query consists of multiple statements separated by the pipe (|) character. Statements are case-sensitive and must be written in lowercase. Query lines can be commented out with #. Statements are executed in the order in which they appear in the query.
Operators
The following operators are supported in RQL:
| Operator | Description |
| ------------ | ----------------------------------------------------------------------- |
| =, != | Equal, Not equal |
| >, < | Greater than, Less than |
| >=, <= | Greater than or equal, Less than or equal |
| and | Boolean AND |
| or | Boolean OR |
| contains | Returns true if the specified value is contained in a string or array |
| not contains | Returns true if the specified value is not contained in a string or array |
| matches, ~= | Returns true if the regex pattern matches |
| incidr | Returns true if the IP address is in the CIDR range |
| not incidr | Returns true if the IP address is not in the CIDR range |
| in | Returns true if the value is in the specified list |
| not in | Returns true if the value is not in the specified list |
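As a rough illustration of what a few of these operators mean, here is a minimal plain-JavaScript sketch of `contains`, `in`, and `incidr`. The helper names (`contains`, `isIn`, `ipv4ToInt`, `inCidr`) are hypothetical and are not part of RQL's API; the CIDR check handles IPv4 only and performs no input validation, and whether RQL behaves identically in edge cases is an assumption.

```js
// contains: true if the value is contained in a string or an array
const contains = (field, value) =>
  typeof field === "string" || Array.isArray(field) ? field.includes(value) : false;

// in: true if the value is in the specified list
const isIn = (value, list) => list.includes(value);

// Convert a dotted IPv4 address to an unsigned 32-bit integer
const ipv4ToInt = (ip) =>
  ip.split(".").reduce((acc, octet) => ((acc << 8) | Number(octet)) >>> 0, 0);

// incidr: true if the IPv4 address falls inside the CIDR range
const inCidr = (ip, cidr) => {
  const [base, bits] = cidr.split("/");
  const prefix = Number(bits);
  // JS shift counts wrap at 32, so a /0 prefix needs an explicit all-zero mask
  const mask = prefix === 0 ? 0 : (0xffffffff << (32 - prefix)) >>> 0;
  return ((ipv4ToInt(ip) & mask) >>> 0) === ((ipv4ToInt(base) & mask) >>> 0);
};

console.log(contains("Finland", "land"));
console.log(isIn("USA", ["USA", "Finland"]));
console.log(inCidr("192.168.1.10", "192.168.1.0/24"));
```

The negated forms (`not contains`, `not in`, `not incidr`) are simply the logical inverse of these checks.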
Statements
alter
alter =
The alter statement creates new fields or overwrites existing fields in the dataset using value functions such as addition, subtraction, and letter casing. The alter statement can be used multiple times in a query, and the fields it creates can be used in later statements.
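The effect of alter can be pictured in plain JavaScript. The sketch below is not RQL syntax and not its implementation; the records and field names (`first`, `ageNextYear`) are hypothetical, and letter casing and addition are used only because they are the kinds of value functions mentioned above.

```js
// Hypothetical records for illustration only
const users = [
  { first: "john", last: "smith", age: 41 },
  { first: "aino", last: "virtanen", age: 29 },
];

// First "alter": overwrite an existing field (letter casing)
const step1 = users.map((row) => ({ ...row, first: row.first.toUpperCase() }));

// Second "alter": create a new field (addition)
const step2 = step1.map((row) => ({ ...row, ageNextYear: row.age + 1 }));

// A later statement, here a filter, can use the field created above
const altered = step2.filter((row) => row.ageNextYear > 30);

console.log(altered);
```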
comp
The comp statement calculates statistics for the results. Its output replaces the other returned records. If comp is used multiple times, the statistics are merged into one row.
| Function | Description |
| -------------- | ------------------------------------------------------------- |
| avg | Returns the average value of the field |
| count | Returns the number of records where field is not null |
| count_distinct | Returns the number of distinct values where field is not null |
| earliest | Returns the earliest timestamp |
| first | Returns the first value |
| last | Returns the last value |
| latest | Returns the latest timestamp |
| max | Returns the maximum value |
| median | Returns the median value |
| min | Returns the minimum value |
| sum | Returns the sum of values |
| to_array | Returns an array of values |
```
// Returns the total number of users, the number of distinct users and the first login time in the USA
dataset = logins
| filter country = "USA"
| comp count username as totalUsers, count_distinct username as distinctUsers, earliest _time as firstLogin

// Returns the amount of logins per country
dataset = logins
| comp count correlationId as logins by country
```
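For readers who want to check their understanding of these two queries, the following plain-JavaScript sketch computes comparable statistics by hand. It does not call RQL; the sample `logins` records, the numeric `_time` values, and the shape of the output objects are assumptions made for illustration.

```js
// Hypothetical login records; _time is assumed to be a numeric timestamp
const logins = [
  { username: "alice", country: "USA", _time: 1700000300, correlationId: "c1" },
  { username: "bob", country: "USA", _time: 1700000100, correlationId: "c2" },
  { username: "alice", country: "USA", _time: 1700000200, correlationId: "c3" },
  { username: "carol", country: "Finland", _time: 1700000400, correlationId: "c4" },
];

const notNull = (v) => v !== null && v !== undefined;

// filter country = "USA" | comp count ..., count_distinct ..., earliest ...
const usa = logins.filter((row) => row.country === "USA");
const usernames = usa.map((row) => row.username).filter(notNull);
const stats = {
  totalUsers: usernames.length, // count: records where the field is not null
  distinctUsers: new Set(usernames).size, // count_distinct: distinct non-null values
  firstLogin: Math.min(...usa.map((row) => row._time)), // earliest: smallest timestamp
};

// comp count correlationId as logins by country
const perCountry = {};
for (const row of logins) {
  if (notNull(row.correlationId)) {
    perCountry[row.country] = (perCountry[row.country] ?? 0) + 1;
  }
}
const loginsByCountry = Object.entries(perCountry).map(([country, count]) => ({
  country,
  logins: count,
}));

console.log(stats, loginsByCountry);
```

Note that the first query yields a single row of statistics, while the `by country` clause in the second yields one row per group.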