High-performance WebAssembly library for Apache Arrow, Feather, and Parquet data with zero-copy semantics, LZ4 compression, and comprehensive model-based testing
`npm install arrow-rs-wasm`

High-performance WebAssembly bindings for Apache Arrow that expose zero-copy columnar data, built in Rust and delivered to JavaScript/TypeScript with camelCase APIs.
- JS-friendly surface – Functions and classes are exported with camelCase/PascalCase via #[wasm_bindgen(js_name = ...)].
- Zero-copy buffers – Access Arrow vectors through live TypedArray views backed directly by Wasm linear memory.
- UTF-8 columns – Retrieve values/offsets/validity buffers and lazily decode strings only when needed.
- Compute helpers – Optional table transforms such as filterTable, takeRows, and sortTable return new handles on the Wasm side.
- Dual runtime support – Tested with modern Vite/React browser pipelines and Node.js ESM projects.
```bash
npm i /Users/ods/Documents/arrow-rs-wasm/pkg
```
```bash
npm i arrow-rs-wasm
```
> If consuming from source, regenerate the package with:
>
> ```bash
> # Browser ESM bundle
> wasm-pack build --release --target web
>
> # Node.js bundle
> wasm-pack build --release --target nodejs
> ```
The package ships as an ESM module. Always call the default export (await init()) once before using any other function.
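When several modules (a boot script and a component, say) each want to initialize the package, memoizing the initializer is one way to honor the "call once" rule. The following is a minimal sketch of a generic helper, not part of arrow-rs-wasm; `once` and `ensureInit` are hypothetical names.

```typescript
// Hypothetical helper: wraps an async initializer so its body runs at most once.
// Every caller receives the same promise.
function once<T>(fn: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (pending === undefined) pending = fn();
    return pending;
  };
}

// Intended use (assumes the default export `init` from arrow-rs-wasm):
//   const ensureInit = once(() => init());
//   await ensureInit(); // safe to call from several modules
```

Note that a rejected promise is cached as well, so a failed initialization is not retried unless a fresh wrapper is created.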
1. Install the local build:
   ```bash
   npm i /Users/ods/Documents/arrow-rs-wasm/pkg
   ```
2. Seed a test handle (for example, in src/test-setup.ts):
   ```ts
   import init, { createTestTable } from 'arrow-rs-wasm';

   void (async () => {
     await init();
     (window as any).TEST_HANDLE = createTestTable();
   })();
   ```
3. Render the app.
   ```tsx
   // src/App.tsx
   import { useEffect, useState } from 'react'
   import init, { getColumnNames, exportUtf8Buffers, exportPrimitiveBuffers } from 'arrow-rs-wasm'

   export default function App() {
     const [ready, setReady] = useState(false)
     const [cols, setCols] = useState<string[]>([])

     useEffect(() => {
       (async () => {
         await init(); // Required wasm-bindgen init
         const handle = (window as any).TEST_HANDLE; // Supply a real handle in your boot script
         const names = await getColumnNames(handle);
         setCols(names);
         setReady(true);

         // Example: peek at buffers
         const primitives = await exportPrimitiveBuffers(handle, 'id');
         console.log('Primitive values buffer', primitives.values);
         const utf = await exportUtf8Buffers(handle, 'name');
         console.log('UTF-8 buffer lengths', utf.values.length, utf.offsets.length);
       })();
     }, []);

     return (
       <div>
         <p>Ready: {String(ready)}</p>
         <h1>arrow-rs-wasm (Vite/React)</h1>
         <pre>{JSON.stringify(cols, null, 2)}</pre>
       </div>
     );
   }
   ```
   ```tsx
   // src/main.tsx
   import { StrictMode } from 'react'
   import { createRoot } from 'react-dom/client'
   import './test-setup' // registers TEST_HANDLE, etc.
   import App from './App'

   createRoot(document.getElementById('root')!).render(
     <StrictMode>
       <App />
     </StrictMode>
   )
   ```
Build a node-compatible bundle (wasm-pack build --release --target nodejs) if you are consuming locally.
```js
// node-example.mjs
import init, { createTestTable, getColumnNames } from 'arrow-rs-wasm';

await init(); // Loads the Node.js target Wasm
const handle = createTestTable(); // or hydrate from Arrow IPC/parquet bytes
const columns = await getColumnNames(handle);
console.log('Columns:', columns);
```
Run with:
```bash
node node-example.mjs
```
| Export | Description |
| --- | --- |
| `init(options?)` | Default async initializer (must be awaited once). |
| `initWithOptions(enableConsoleLogs: boolean)` | Optional second-stage setup for debugging. |
| `setPanicHook()` | Routes Rust panics to console (no-op unless enabled). |
| `createTestTable()` | Returns a demo table handle for quick experiments. |
| `readTableFromBytes(data: Uint8Array)` | Loads Arrow IPC bytes into Wasm, returns a table handle. |
| `writeTableToIpc(handle, enableLz4)` | Serializes a table handle back to IPC bytes. |
| `getColumnNames(handle)` | Resolves `string[]` of column names. |
| `exportPrimitiveBuffers(handle, columnName)` | Returns `{ values: TypedArray; validity?: Uint8Array \| null }`. |
| `exportUtf8Buffers(handle, columnName)` | Returns `{ values: Uint8Array; offsets: Int32Array \| BigInt64Array; validity?: Uint8Array \| null }`. |
| `exportBinaryBuffers(handle, columnName)` | Returns `{ values: Uint8Array; offsets?: Int32Array; validity?: Uint8Array \| null }`. |
| `filterTable(handle, predicateSpec)` | Applies predicate, yields new table handle. |
| `takeRows(handle, indices)` | Selects row subset. |
| `sortTable(handle, sortKeys)` | Returns sorted table handle. |
| `getTableInfo(handle)` | Summarizes row/column counts and schema. |
| `freeTable(handle)` | Releases Wasm-side resources. |
| `getMemoryInfo()` | Debug helper describing Wasm memory usage. |
> Internally, Rust keeps snake_case identifiers; the exported JS API uses camelCase/PascalCase thanks to #[wasm_bindgen(js_name = ...)].
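Because `freeTable(handle)` is what releases Wasm-side resources, it can help to pair every handle with a guaranteed free. The sketch below shows one possible pattern; `withTable` is a hypothetical helper that is not exported by the package, and the `free` callback is passed in so the snippet stays self-contained.

```typescript
type TableHandle = number;

// Hypothetical helper: runs `use` with a handle and always calls `free`
// afterwards, even when `use` throws or rejects.
async function withTable<T>(
  handle: TableHandle,
  free: (h: TableHandle) => void,
  use: (h: TableHandle) => T | Promise<T>,
): Promise<T> {
  try {
    return await use(handle);
  } finally {
    free(handle);
  }
}

// Intended use (assumes createTestTable, freeTable, getColumnNames from arrow-rs-wasm):
//   const names = await withTable(createTestTable(), freeTable, (h) => getColumnNames(h));
```

Values derived from buffer views should be copied out inside `use`, since the handle is no longer usable once the helper returns.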
Every buffer exporter returns a view into Wasm linear memory—no copies are made. Treat these objects as live slices:
- exportPrimitiveBuffers → TypedArray (e.g., Int32Array, Float64Array) plus optional Uint8Array validity bitmap.
- exportUtf8Buffers → values (Uint8Array of concatenated UTF-8), offsets (Int32Array or BigInt64Array depending on column width), and optional validity.
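To make "live slice" concrete, the sketch below contrasts a view with a copy. It uses a plain `ArrayBuffer` as a stand-in for Wasm linear memory, so it illustrates typed-array semantics only; `sharesBuffer` is a hypothetical helper, not a library export.

```typescript
// Stand-in for Wasm linear memory (an assumption made for illustration).
const backing = new ArrayBuffer(32);

const live = new Int32Array(backing, 0, 4); // zero-copy view onto the backing store
const snapshot = live.slice();              // slice() allocates an independent copy

// A write to the underlying memory is visible through the view, not the copy.
new Int32Array(backing)[0] = 7;

// Hypothetical check: does this view still point at the expected buffer?
function sharesBuffer(view: ArrayBufferView, reference: ArrayBufferLike): boolean {
  return view.buffer === reference;
}

console.log(live[0], snapshot[0]);            // view sees the write, copy does not
console.log(sharesBuffer(live, backing));     // same backing store
console.log(sharesBuffer(snapshot, backing)); // separate allocation
```

Use `slice()` (or `Array.from`) when a value has to outlive the view it came from.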
Lazily decode UTF-8 values only when needed:
```ts
const utf = await exportUtf8Buffers(handle, 'name');
const decoder = new TextDecoder();
const i = 0;
// Number() also covers the BigInt64Array offsets of LargeUtf8 columns
const start = Number(utf.offsets[i]);
const end = Number(utf.offsets[i + 1]);
const firstValue = decoder.decode(utf.values.subarray(start, end));
```
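For UI code that reads the same cells repeatedly, the on-demand decode can be wrapped in a small caching accessor. This is a sketch under the buffer layout described above; `lazyStrings` is a hypothetical helper, and the `Utf8Buffers` shape is restated locally so the snippet runs on its own.

```typescript
interface Utf8Buffers {
  values: Uint8Array;
  offsets: Int32Array | BigInt64Array;
  validity?: Uint8Array | null;
}

// Hypothetical helper: returns an accessor that decodes row i on first use
// and serves later reads from a cache. Null rows yield null.
function lazyStrings(buf: Utf8Buffers): (i: number) => string | null {
  const decoder = new TextDecoder();
  const cache = new Map<number, string | null>();
  return (i) => {
    const hit = cache.get(i);
    if (hit !== undefined) return hit;
    // Validity bitmaps are LSB-first: bit (i % 8) of byte floor(i / 8).
    const valid = !buf.validity || (buf.validity[i >> 3] & (1 << (i & 7))) !== 0;
    const value = valid
      ? decoder.decode(buf.values.subarray(Number(buf.offsets[i]), Number(buf.offsets[i + 1])))
      : null;
    cache.set(i, value);
    return value;
  };
}
```

Cached strings are ordinary JavaScript copies, but uncached rows still read from the views, so build a new accessor after re-exporting buffers.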
Wasm memory may grow during allocations, producing a new backing ArrayBuffer. Existing views detach and report length 0. After any heavy operation (e.g., filterTable, sortTable, or bulk append), regenerate views:
```ts
let { values } = await exportPrimitiveBuffers(handle, 'score');
// ... after operations that may allocate
({ values } = await exportPrimitiveBuffers(handle, 'score')); // refresh view
```
Long-lived UIs should re-request buffers whenever a major action completes.
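The detach behavior can be reproduced without the library. The sketch below grows a standalone `WebAssembly.Memory` (an assumption for illustration, not the package's own memory) and shows the old view going empty while a fresh view still sees the data; `looksDetached` is a hypothetical helper.

```typescript
// One Wasm page is 64 KiB; this memory is independent of arrow-rs-wasm.
const memory = new WebAssembly.Memory({ initial: 1 });

const before = new Uint8Array(memory.buffer, 0, 16);
before[0] = 42;

// Growing replaces memory.buffer with a new ArrayBuffer and detaches the old one.
memory.grow(1);

// Hypothetical check: a detached view reports zero length over a zero-length buffer.
function looksDetached(view: ArrayBufferView): boolean {
  return view.byteLength === 0 && view.buffer.byteLength === 0;
}

const after = new Uint8Array(memory.buffer, 0, 16); // rebuilt view
console.log(looksDetached(before), looksDetached(after), after[0]);
```

A guard like this can decide when to call an exporter again instead of refreshing unconditionally.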
```ts
export type TableHandle = number;

export interface PrimitiveBuffers {
  values: Int8Array | Int16Array | Int32Array | Float32Array | Float64Array;
  validity?: Uint8Array | null;
}

export interface Utf8Buffers {
  values: Uint8Array;
  offsets: Int32Array | BigInt64Array; // LargeUtf8 uses 64-bit offsets
  validity?: Uint8Array | null;
}
```
Type definitions ship in pkg/arrow_rs_wasm.d.ts. Consumers need either esModuleInterop enabled or a native ESM pipeline.
```ts
const { values, validity } = await exportPrimitiveBuffers(handle, 'id');
const dataView = new DataView(values.buffer, values.byteOffset, values.byteLength);
const first = dataView.getInt32(0, true);
const isValid = !validity || (validity[0] & 1) === 1;
```
```ts
const utf = await exportUtf8Buffers(handle, 'name');
const decoder = new TextDecoder();
for (let i = 0; i < utf.offsets.length - 1; i++) {
  const isNull = utf.validity && (utf.validity[Math.floor(i / 8)] & (1 << (i % 8))) === 0;
  if (isNull) continue;
  const start = Number(utf.offsets[i]);
  const end = Number(utf.offsets[i + 1]);
  console.log(decoder.decode(utf.values.subarray(start, end)));
}
```
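Both reading examples test bits of the validity bitmap inline. Factoring that test out gives a sketch like the one below, which assumes the LSB-first bit order used above; `isValidAt` and `countNulls` are hypothetical names, not library exports.

```typescript
// Hypothetical helper: true when row i is non-null.
// A missing bitmap means every row is valid.
function isValidAt(validity: Uint8Array | null | undefined, i: number): boolean {
  if (!validity) return true;
  return (validity[i >> 3] & (1 << (i & 7))) !== 0;
}

// Hypothetical helper: counts null rows among the first `length` rows.
function countNulls(validity: Uint8Array | null | undefined, length: number): number {
  let nulls = 0;
  for (let i = 0; i < length; i++) {
    if (!isValidAt(validity, i)) nulls++;
  }
  return nulls;
}
```

The row count comes from the column itself (for UTF-8 columns, `offsets.length - 1`), because the bitmap is padded to whole bytes.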
- Favor filterTable, takeRows, and sortTable (where available) to keep computations inside Wasm and reduce host copies.
- Avoid eagerly decoding UTF-8 strings; decode on demand at the UI boundary.
- Reuse handles and refresh views after operations that might trigger Wasm memory growth.
- /Users/ods/Documents/arrow-rs-wasm/pkg contains the ESM wrapper (arrow_rs_wasm.js), the compiled Wasm artifact, type definitions, and package.json.
- Rebuild with wasm-pack build --release --target web (browser) or --target nodejs (Node).
- Install into client projects with:
  ```bash
  npm i /Users/ods/Documents/arrow-rs-wasm/pkg
  ```
- Browser (Vite): npm run dev, open http://localhost:5173, ensure await init() runs, verify column names, decode a UTF-8 element, and confirm zero-copy buffers by comparing a view's .buffer to a cached reference.
- Chromium DevTools MCP: Automate via a DevTools protocol session—assert camelCase exports, zero-copy equality, and lazy decode results.
- Memory detach test: Execute an operation that grows memory (e.g., heavy filter), then re-run buffer exporters to rebuild views.
- Module not found: Use the appropriate bundle (--target web for browsers, --target nodejs for Node).
- Empty buffers: Likely due to Wasm memory growth; call the exporter again.
- Snake_case exports: Ensure the Rust functions use `#[wasm_bindgen(js_name = ...)]`, then rebuild the pkg directory.
- Semantic versioning: MAJOR.MINOR.PATCH.
- Dual-licensed under MIT and Apache-2.0—use either license at your option.