
Quickstart

Boot the gateway, connect a client, and run your first SQL query against a live TPC-H dataset.

Prerequisites

  • Docker Compose path (Option A): just Docker. The stack ships its own Postgres, so you need nothing else installed.
  • Jar path (Option B): JDK 21 or later (Arrow Flight requires Java 21+) and a Postgres 16 or later reachable at localhost:5432 (the default) for the control-plane schema and tenant catalogs.
  • Ports 20900 and 31338 free on the host (admin REST/UI on 20900, FlightSQL edge on 31338), for either path.
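
A quick pre-flight check for the port requirement can be sketched in Python. This is an illustrative helper, not part of the repository: the function name port_is_free and the probe against 127.0.0.1 are assumptions, and a connect probe only detects a process that is actively listening.

```python
import socket

# Ports named in the prerequisites: admin REST/UI and the FlightSQL edge.
REQUIRED_PORTS = {"admin REST/UI": 20900, "FlightSQL edge": 31338}


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(0.5)
        # connect_ex returns 0 only when a listener accepted the connection.
        return probe.connect_ex((host, port)) != 0


if __name__ == "__main__":
    for name, port in REQUIRED_PORTS.items():
        state = "free" if port_is_free(port) else "IN USE"
        print(f"{port} ({name}): {state}")
```

If either port reports IN USE, stop the process holding it before booting.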

For alternative deployment paths (Kubernetes) and the full environment-variable model, see the Installation guide.

Boot the manager

Option A: Docker Compose (Docker only)

The quickest start: only Docker is required. From the root of a cloned repository:

./scripts/run-docker-compose.sh

This pulls the published starlakeai/quack-on-demand image plus a bundled postgres:16-alpine, brings the whole stack up, and waits for the manager to become ready - no local JDK or Postgres needed. Stop it with:

./scripts/stop-docker-compose.sh

Option B: From the jar

If you have JDK 21+ and a reachable Postgres:

./scripts/run-jar.sh

On first run, the script downloads the latest release jar from Maven Central, probes Postgres, creates the control-plane database (qod), and then starts the JVM. If Postgres is unreachable, the script warns and aborts; start Postgres first or use the Docker Compose path above. Stop the gateway with ./scripts/stop-jar.sh, which sends SIGTERM and escalates to SIGKILL after 10 seconds.
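
The SIGTERM-then-SIGKILL shutdown pattern described above can be illustrated in Python. This is a sketch of the general technique under the stated 10-second grace period, not the contents of stop-jar.sh; the helper name stop_gracefully and the sleeping child process are stand-ins.

```python
import subprocess
import sys


def stop_gracefully(proc: subprocess.Popen, grace_seconds: float = 10.0) -> int:
    """Send SIGTERM, wait up to grace_seconds, then SIGKILL if still alive."""
    proc.terminate()  # SIGTERM on POSIX
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX
        return proc.wait()


if __name__ == "__main__":
    # Stand-in child process; the real target would be the gateway JVM.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print("exit code:", stop_gracefully(child))
```

The grace period gives the process a chance to shut down cleanly before it is forcibly killed.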

What comes up

Either path brings up the same surface:

  • Admin REST + UI on http://localhost:20900
  • Arrow FlightSQL edge on localhost:31338 (TLS on, self-signed cert auto-generated under certs/)
  • Two admin accounts seeded - admin and admin@localhost.local - both with password admin
  • Two bootstrap tenants seeded from src/main/resources/bootstrap-demo.yaml: acme (tenant-db acme_tpch, pools bi + etl) and globex (tenant-db globex_tpcds, pool bi). Idempotent on restart.
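
To confirm from a script that both endpoints are up, a small polling loop is one option. The helper name wait_for_port and the timeouts below are assumptions for illustration. A successful TCP connection only shows that the port accepts connections, not that bootstrap seeding has finished.

```python
import socket
import time


def wait_for_port(port: int, host: str = "localhost", timeout: float = 60.0) -> bool:
    """Poll until host:port accepts a TCP connection or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Nothing is listening yet (or the host is unreachable); retry shortly.
            time.sleep(0.5)
    return False
```

Calling wait_for_port(20900) and wait_for_port(31338) after launching either script covers the admin REST/UI and the FlightSQL edge respectively.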

Boot flags

The same flags work on both run-docker-compose.sh and run-jar.sh, and they combine:

Flag | Effect
LOAD_TPCH=N | Seeds TPC-H sf=N into acme/acme_tpch (8 tables in schema tpch1). The jar path runs the loader on the host (DuckDB CLI + libduckdb are auto-installed by run-jar.sh on first boot, see Native run); the Compose path seeds inside the container. LOAD_TPCH=1 is ~6 M lineitem rows; SF=10 is ~60 M. Either this or LOAD_TPCDS being set also exports QOD_BOOTSTRAP_YAML so the JVM imports the bundled demo manifest.
LOAD_TPCDS=N | Seeds TPC-DS sf=N into globex/globex_tpcds (24 tables in schema tpcds1). Slower than TPC-H at the same SF (SF=10 ≈ several minutes; SF=100+ spills to disk).
LOAD_TPC=N | Legacy shortcut: equivalent to setting both LOAD_TPCH=N and LOAD_TPCDS=N. Explicit per-bench vars override it.
NUKE=1 | Tear down and wipe local state (Postgres data, parquet under ducklake/, certs/) before booting. Irreversible.
QOD_VERSION=latest-snapshot | Use the latest snapshot image/jar instead of the latest release.
BUILD=1 | Build from local source (Compose: from the repo Dockerfile; jar: sbt assembly) instead of pulling/downloading.
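
The precedence between the legacy LOAD_TPC shortcut and the per-benchmark variables can be modeled in a few lines of Python. This is a sketch of the rule as documented, not the scripts' actual implementation; the function name resolve_scale_factors is hypothetical, and treating an empty value as unset is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_scale_factors(env: Mapping[str, str]) -> dict[str, Optional[str]]:
    """Explicit LOAD_TPCH / LOAD_TPCDS values win; LOAD_TPC fills in the rest."""
    # Assumption: an empty string counts as "not set".
    legacy = env.get("LOAD_TPC") or None
    return {
        "tpch": env.get("LOAD_TPCH") or legacy,
        "tpcds": env.get("LOAD_TPCDS") or legacy,
    }


if __name__ == "__main__":
    print(resolve_scale_factors(os.environ))
```

Under this model, LOAD_TPC=1 LOAD_TPCDS=10 seeds TPC-H at scale factor 1 and TPC-DS at scale factor 10.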

For a clean, freshly seeded environment in one shot, combine them. This wipes any previous state and boots with both demo datasets at scale-factor 1:

NUKE=1 LOAD_TPCH=1 LOAD_TPCDS=1 ./scripts/run-docker-compose.sh

Setting either LOAD_TPCH or LOAD_TPCDS (or the legacy LOAD_TPC shortcut) imports the bundled manifest at src/main/resources/bootstrap-demo.yaml, which declares the tenants, roles, groups, and users for both acme and globex; see the Access control model for the full ACL matrix.

Pick one benchmark to keep boot snappy:

NUKE=1 LOAD_TPCH=1 ./scripts/run-docker-compose.sh    # TPC-H only, ~10 s seed
NUKE=1 LOAD_TPCDS=10 ./scripts/run-docker-compose.sh  # TPC-DS only, SF=10

The jar path takes the same flags:

NUKE=1 LOAD_TPCH=1 LOAD_TPCDS=1 ./scripts/run-jar.sh

Open the admin console

Browse to http://localhost:20900 and log in with:

  • Username: admin
  • Password: admin

The Tenants page shows the two bootstrap tenants, acme and globex. Opening either reveals the Databases, Pools, and Auth provider tabs. The Nodes page shows the live cluster dashboard and the recent-statements history.

Run your first query

DBeaver / JDBC

Install the Apache Arrow Flight SQL JDBC driver (org.apache.arrow:flight-sql-jdbc-driver, available on Maven Central). In DBeaver, create a new connection and paste this URL directly into the JDBC URL field:

jdbc:arrow-flight-sql://localhost:31338?useEncryption=true&disableCertificateVerification=true&user=admin&password=admin&tenant=acme&pool=bi

Set the driver class to org.apache.arrow.driver.jdbc.ArrowFlightJdbcDriver.

The disableCertificateVerification=true parameter is required because the gateway starts with an auto-generated self-signed certificate (see the TLS guide for how to supply a CA-signed cert and remove that flag).

The tenant=acme&pool=bi parameters are routing headers: the FlightSQL edge requires both to resolve which pool services the connection. Swap in tenant=globex&pool=bi to drive the TPC-DS demo instead.
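
When the URL needs to be generated for several tenants or pools, it can be assembled programmatically. The sketch below reproduces the URL shown above using only the Python standard library; the helper name flight_sql_jdbc_url and its verify_cert switch are illustrative and not part of the product.

```python
from urllib.parse import quote, urlencode


def flight_sql_jdbc_url(host: str, port: int, user: str, password: str,
                        tenant: str, pool: str, verify_cert: bool = False) -> str:
    """Build an Arrow Flight SQL JDBC URL with tenant/pool routing parameters."""
    params = {"useEncryption": "true"}
    if not verify_cert:
        # Needed only while the gateway serves its self-signed certificate.
        params["disableCertificateVerification"] = "true"
    params.update(user=user, password=password, tenant=tenant, pool=pool)
    # quote_via=quote percent-encodes special characters (spaces become %20).
    return f"jdbc:arrow-flight-sql://{host}:{port}?{urlencode(params, quote_via=quote)}"


if __name__ == "__main__":
    print(flight_sql_jdbc_url("localhost", 31338, "admin", "admin", "acme", "bi"))
    print(flight_sql_jdbc_url("localhost", 31338, "admin", "admin", "globex", "bi"))
```

Passing verify_cert=True omits the disableCertificateVerification parameter, which matches the setup once a CA-signed certificate is in place.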

Python (ADBC)

Install the driver:

pip install --user adbc_driver_flightsql adbc_driver_manager

Run the bundled load tester as a one-shot client (defaults to the TPC-H workload against acme's tpch1 schema):

python3 ./scripts/loadtest/loadtest.py \
--url grpc+tls://localhost:31338 --insecure \
--user admin --password admin --superuser \
--tenant acme --pool bi \
-w 1 -i 1 --warmup 0

--tenant and --pool are required (or set LT_TENANT / LT_POOL); the demo bootstrap creates tenants acme and globex, each with a pool named bi. The --insecure flag skips certificate verification for the auto-generated self-signed cert. The --superuser flag adds the superuser=true gRPC header so the bootstrap admin user (which lives in qodstate_user with tenant IS NULL) authenticates against the system realm; drop it when running as a tenant-scoped user.

To switch workload, pass --workload tpcds (or set LT_WORKLOAD=tpcds). The runner ships two curated benchmarks:

Workload | Default schema | Default tenant/pool wiring | Tables touched
tpch (default) | tpch1 | acme / bi (demo bootstrap) | lineitem, customer, orders, nation, region, supplier, part
tpcds | tpcds1 | globex / bi (demo bootstrap) | the 24 TPC-DS tables seeded by scripts/load-tpcds-dbgen.sh

Each workload cycles a handful of representative queries (per-group aggregation, multi-way joins, top-N, window functions, date-range filters). The TPC-DS workload requires the target tenant-db to be seeded first:

SF=1 ./scripts/load-tpcds-dbgen.sh  # seeds globex_tpcds.tpcds1
python3 ./scripts/loadtest/loadtest.py --workload tpcds --tenant globex --pool bi \
--user admin --password admin --superuser --insecure -w 4 -i 50

Override the schema with --schema tpcds10 when you've seeded at a larger scale factor.

Example SQL

Once connected, run:

SELECT count(*) FROM tpch1.customer;

Tables live under the tpch1 schema inside the acme tenant's database (acme_tpch). The gateway auto-qualifies unqualified table names to the pool's default database and schema, so customer and tpch1.customer are equivalent once the session is scoped to the acme/bi pool.
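
The same query can also be issued from a short Python script through the ADBC Flight SQL driver installed earlier. Treat this as a sketch under assumptions: the gRPC header names tenant, pool, and superuser are inferred from the JDBC parameters and the load tester's flags rather than confirmed against the gateway, and the helper names are hypothetical.

```python
def flight_sql_db_kwargs(user: str, password: str, tenant: str, pool: str,
                         superuser: bool = False, insecure: bool = False) -> dict[str, str]:
    """Build ADBC database options for the FlightSQL edge."""
    # ADBC Flight SQL prefix for custom per-call gRPC headers.
    header = "adbc.flight.sql.rpc.call_header."
    kwargs = {
        "username": user,
        "password": password,
        # Assumed header names, mirroring the tenant= / pool= JDBC parameters.
        header + "tenant": tenant,
        header + "pool": pool,
    }
    if superuser:
        # Assumed to mirror the load tester's --superuser flag.
        kwargs[header + "superuser"] = "true"
    if insecure:
        # Skip certificate verification for the self-signed cert.
        kwargs["adbc.flight.sql.client_option.tls_skip_verify"] = "true"
    return kwargs


def count_customers(uri: str = "grpc+tls://localhost:31338") -> int:
    """Run the example count query as the bootstrap admin on acme/bi."""
    # Imported lazily so the options helper works without the driver installed.
    import adbc_driver_flightsql.dbapi as flightsql

    kwargs = flight_sql_db_kwargs("admin", "admin", "acme", "bi",
                                  superuser=True, insecure=True)
    with flightsql.connect(uri, db_kwargs=kwargs) as conn:
        with conn.cursor() as cur:
            cur.execute("SELECT count(*) FROM tpch1.customer")
            return cur.fetchone()[0]
```

Calling count_customers() against a running gateway with the TPC-H data seeded should return the row count of the customer table; drop superuser=True when connecting as a tenant-scoped user.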

Next steps

  • Installation - native-jar setup, Docker Compose, Kubernetes, and environment variable reference.
  • Configuration reference - every QOD_* / PROXY_* environment variable with its default and description.
  • REST API reference - the interactive API explorer is linked in the top navigation of the admin UI.