Cube.js pre-aggregation storage layer.



Over the past year, we've accumulated feedback around various use cases with
pre-aggregations and how to store them. We've learned that there is a set of
problems where relational databases, used as a storage layer, have significant
performance and functionality issues.
These problems include:
- Performance issues with high-cardinality rollups (1B and more)
- Lack of HyperLogLog support
- Degraded performance for big UNION ALL queries
- Poor JOIN performance across rolled-up tables
- Table/schema name length issues across different database types
- SQL type differences between source and external database
Over time, we realized that if we try to fix these issues with existing database
engines, we'd end up modifying these databases' codebases in one way or another.
We decided to take another approach and write our own materialized OLAP cache
store, designed solely to store and serve rollup tables at scale.
To optimize performance as much as possible, we went with a native approach and
are using Rust to develop Cube Store, building on technologies such as
RocksDB, Apache Parquet, and Arrow that have proven effective at solving
data access problems.
Cube Store is fully open-sourced and released under the Apache 2.0 license.
We intend to start distributing Cube Store with Cube.js, and eventually make
Cube Store the default pre-aggregation storage layer for Cube.js. Support for
MySQL and Postgres as external databases will continue, but at a lower priority.
We'll also update all documentation regarding pre-aggregations and include usage
and deployment instructions for Cube Store.
> If your platform/architecture is not supported, you can launch Cube Store
> using Docker.
| | linux-gnu | linux-musl | darwin | win32 |
| -------- | :---------: | :----------: | :------: | :-----: |
| x86 | N/A | N/A | N/A | N/A |
| x86_64 | ✅ | ✅ | ✅ | ✅ |
| arm64 | ✅ | | ✅[1] | |
[1] It can be launched using Rosetta 2 via the x86_64-apple binary.
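
The support table above can also be expressed as a small lookup, which may be handy in an install or CI script that has to decide between the prebuilt binary and Docker. This is only a sketch: `hasPrebuiltBinary` is a hypothetical helper and not part of any Cube.js package, blank table cells are assumed to mean "not supported", and the libc flavour is passed in by the caller because Node.js has no built-in way to detect it.

```javascript
// Prebuilt-binary support, transcribed from the table above.
// Keys are `${arch}/${target}`; only the cells marked as supported are listed.
// Assumption: blank cells in the table mean "no prebuilt binary".
const SUPPORTED = new Set([
  'x86_64/linux-gnu',
  'x86_64/linux-musl',
  'x86_64/darwin',
  'x86_64/win32',
  'arm64/linux-gnu',
  'arm64/darwin', // footnote [1]: runs the x86_64 binary through Rosetta 2
]);

// Node.js reports 'x64' for x86_64 and 'ia32' for 32-bit x86.
const ARCH_NAMES = { x64: 'x86_64', arm64: 'arm64', ia32: 'x86' };

// Hypothetical helper. `libc` ('gnu' or 'musl') only matters on Linux.
function hasPrebuiltBinary(platform, arch, libc = 'gnu') {
  const target = platform === 'linux' ? `linux-${libc}` : platform;
  return SUPPORTED.has(`${ARCH_NAMES[arch]}/${target}`);
}

// Check the machine this script runs on (assumes glibc when on Linux).
console.log(hasPrebuiltBinary(process.platform, process.arch));
```

When the function returns `false`, the Docker image is the fallback suggested in the note above.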
Starting with v0.26.48, Cube.js ships with Cube Store enabled when `CUBEJS_DEV_MODE=true`.
You don't need to set up any `CUBEJS_EXT_DB_*` environment variables or `externalDriverFactory` inside your `cube.js` configuration file.
For versions prior to v0.26.48, you should upgrade your project to the latest
version and install the Cube Store driver:
```bash
yarn add @cubejs-backend/cubestore-driver
```
After starting up, Cube.js will print a message:
🔥 Cube Store (0.26.64) is assigned to 3030 port.
Start Cube Store in a Docker container and publish port 3030 on the host, so that it is reachable at 127.0.0.1:3030:
```bash
docker run -d -p 3030:3030 cubejs/cubestore:edge
```
Configure Cube.js to use the above connection for an external database via the
.env file:
```dotenv
CUBEJS_EXT_DB_TYPE=cubestore
CUBEJS_EXT_DB_HOST=127.0.0.1
```
Create a docker-compose.yml file with the following content:
```yml
version: '2.2'

services:
  cubestore:
    image: cubejs/cubestore:edge

  cube:
    image: cubejs/cube:latest
    ports:
      - 4000:4000 # Cube.js API and Developer Playground
      - 3000:3000 # Dashboard app, if created
    env_file: .env
    depends_on:
      - cubestore
    links:
      - cubestore
    volumes:
      - ./schema:/cube/conf/schema
```
Configure Cube.js to use the above connection for an external database via the
.env file:
```dotenv
CUBEJS_EXT_DB_TYPE=cubestore
CUBEJS_EXT_DB_HOST=cubestore
```
To build the Cube Store image locally and run it:
```bash
docker build -t cubejs/cubestore:latest .
docker run --rm cubejs/cubestore:latest
```
Debian prerequisites (incomplete): `apt-get install lld libssl-dev pkg-config cmake`
When changing Datafusion or Arrow:
Check out https://github.com/cube-js/arrow-rs/tree/cube and
https://github.com/cube-js/arrow-datafusion/tree/cube and add the
following to the current directory's Cargo.toml. (But remember to
exclude this from your PR!)
```toml
[patch.'https://github.com/cube-js/arrow-rs']
parquet = { path = "../../../arrow-rs/parquet" }
arrow = { path = "../../../arrow-rs/arrow" }

[patch.'https://github.com/cube-js/arrow-datafusion']
datafusion = { path = "../../../arrow-datafusion/datafusion" }
```
Of course, you can use absolute paths or adjust the paths to your
chosen checkout location.
It is possible that uncommenting the arrow-datafusion
`.cargo/config.toml` path line works for you too, but it might not if
you are making changes in arrow-rs.
Cube Store is Apache 2.0 licensed.