Safety & validation

DB Engine is read-only by contract. The validator layer enforces it — even SQL hand-edited in the workbench passes through the same gate as planner-generated SQL.

Five rejection rules

| Rule | Why |
| --- | --- |
| SELECT-only | DROP / DELETE / UPDATE / INSERT / TRUNCATE / GRANT / REVOKE / ALTER are all rejected at parse time |
| Single-statement | Stacked ;-separated statements are rejected, so no DML can be smuggled behind a SELECT |
| Single-connection | Three-part names referencing another connection are rejected |
| No pg_* / information_schema writes | Even on Postgres, system-catalog writes are blocked |
| Row-cap enforced | LIMIT is auto-injected if missing |
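
To make the rules concrete, here is a minimal sketch of such a gate in Python. It is not DB Engine's validator: the real one rejects at parse time, which implies a proper SQL parser, while this sketch uses naive keyword matching. The names `validate`, `FORBIDDEN`, and `DEFAULT_ROW_CAP` (and the cap value) are assumptions made for illustration.

```python
import re

# Keywords from the SELECT-only rule above.
FORBIDDEN = {"DROP", "DELETE", "UPDATE", "INSERT", "TRUNCATE", "GRANT", "REVOKE", "ALTER"}
DEFAULT_ROW_CAP = 1000  # hypothetical value; the real cap is not documented here


def _strip_literals(sql: str) -> str:
    """Blank out single-quoted literals so their contents are not inspected."""
    return re.sub(r"'(?:[^']|'')*'", "''", sql)


def validate(sql: str, row_cap: int = DEFAULT_ROW_CAP) -> str:
    """Return SQL that is safe to execute, or raise ValueError."""
    body = sql.strip().rstrip(";").strip()
    bare = _strip_literals(body)

    # Single-statement: any remaining ';' means stacked statements.
    if ";" in bare:
        raise ValueError("single-statement: stacked statements rejected")

    # SELECT-only (CTEs are out of scope for this sketch). Because every
    # write is rejected here, system-catalog writes are rejected too.
    words = re.findall(r"[A-Za-z_]\w*", bare.upper())
    if not words or words[0] != "SELECT":
        raise ValueError("SELECT-only: statement must start with SELECT")
    bad = FORBIDDEN.intersection(words)
    if bad:
        raise ValueError(f"SELECT-only: forbidden keyword(s) {sorted(bad)}")

    # Single-connection: this sketch rejects every three-part name,
    # which is stricter than checking whether it targets another connection.
    if re.search(r"\b\w+\.\w+\.\w+\b", bare):
        raise ValueError("single-connection: three-part names rejected")

    # Row cap: append LIMIT when the query has none.
    if "LIMIT" not in words:
        body = f"{body} LIMIT {row_cap}"
    return body
```

With these assumptions, `validate("SELECT id FROM patients;")` returns the query with `LIMIT 1000` appended, while `validate("SELECT 1; DROP TABLE patients")` raises `ValueError`.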

Scope predicates

Scope predicates are a per-connection list of WHERE fragments. The validator ANDs them into every executed query. Format:

[
{ table: "encounters", predicate: "tenant_id = '...'" },
{ table: "patients", predicate: "tenant_id = '...'" }
]

The planner’s system prompt describes the scope predicates so that it doesn’t apply them a second time. If the planner’s SQL omits them, the validator applies them anyway, so they can’t be bypassed.
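
The AND-injection can be sketched as follows. This is an illustration under stated assumptions, not the validator's code: it assumes a parser has already extracted the query's WHERE condition and its referenced tables, and the names `ScopePredicate` and `and_inject` and the tenant value `t_demo` are hypothetical. It also leaves predicates unqualified, so a multi-table query would need table-qualified columns.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Sequence


@dataclass(frozen=True)
class ScopePredicate:
    table: str
    predicate: str


def and_inject(where: Optional[str],
               tables: Iterable[str],
               scope: Sequence[ScopePredicate]) -> str:
    """AND every applicable scope predicate onto an optional WHERE condition.

    The original condition is parenthesized so that an OR inside it
    cannot escape the scope predicate.
    """
    referenced = set(tables)
    clauses = [f"({where})"] if where else []
    clauses += [f"({s.predicate})" for s in scope if s.table in referenced]
    return " AND ".join(clauses)


# Illustrative scope list; 't_demo' is a made-up tenant id.
SCOPE = [
    ScopePredicate("encounters", "tenant_id = 't_demo'"),
    ScopePredicate("patients", "tenant_id = 't_demo'"),
]

scoped = and_inject("status = 'open' OR status = 'new'", ["encounters"], SCOPE)
# scoped == "(status = 'open' OR status = 'new') AND (tenant_id = 't_demo')"
```

A table with no entry in the scope list contributes no clause, which is why the scope list has to cover every table you want restricted.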

EXPLAIN preflight

Before execution, the driver runs the equivalent of:

  • Postgres: EXPLAIN (FORMAT JSON) <user_sql>.
  • MySQL: EXPLAIN FORMAT=JSON <user_sql>.
  • BigQuery: dry-run with billing estimate.
  • Snowflake: EXPLAIN <user_sql>.
  • DuckDB: EXPLAIN <user_sql>.

If the estimated row scan exceeds db.explain.max_rows, the query is blocked. If the estimated cost in USD (BigQuery only) exceeds db.explain.max_cost_usd, the query is blocked. Both limits are tenant-configurable.
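
As a rough sketch of the threshold check for Postgres: the `"Plan"`, `"Plans"`, and `"Plan Rows"` keys are part of the EXPLAIN (FORMAT JSON) output, but how DB Engine derives its row estimate from the plan is not documented here. Taking the largest `"Plan Rows"` across all plan nodes is an assumption of this sketch, and the function names and sample plan are hypothetical.

```python
import json
from typing import Optional


def max_plan_rows(node: dict) -> int:
    """Largest planner row estimate across a plan node and its children."""
    rows = int(node.get("Plan Rows", 0))
    for child in node.get("Plans", []):
        rows = max(rows, max_plan_rows(child))
    return rows


def estimated_rows(explain_json: str) -> int:
    """Read the row estimate out of Postgres EXPLAIN (FORMAT JSON) text."""
    return max_plan_rows(json.loads(explain_json)[0]["Plan"])


def preflight_allows(est_rows: int, max_rows: int,
                     est_cost_usd: Optional[float] = None,
                     max_cost_usd: Optional[float] = None) -> bool:
    """True if the query stays within both limits; cost is checked only when given."""
    if est_rows > max_rows:
        return False
    if est_cost_usd is not None and max_cost_usd is not None:
        return est_cost_usd <= max_cost_usd
    return True


# Made-up plan: a join whose largest input is a 2,000,000-row sequential scan.
SAMPLE = json.dumps([{"Plan": {
    "Node Type": "Hash Join", "Plan Rows": 500,
    "Plans": [
        {"Node Type": "Seq Scan", "Plan Rows": 2000000},
        {"Node Type": "Seq Scan", "Plan Rows": 500},
    ],
}}])

blocked = not preflight_allows(estimated_rows(SAMPLE), max_rows=1_000_000)
```

Here the top-level node estimates only 500 output rows, but the scan beneath it is over the limit, which is why the sketch walks the whole tree rather than reading the root alone.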

Statement timeout

The driver applies a statement timeout of 30 seconds by default and cancels any query that runs longer. Adjust it per connection via config.statement_timeout_ms if you need a different ceiling (some analytical workloads do).
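
How the driver enforces the timeout is not specified here. One plausible approach, sketched below, is to issue a session-level setting before running the query. `statement_timeout` (Postgres) and `max_execution_time` (MySQL, applies to SELECT statements) are real settings that both take milliseconds; the function name and dialect strings are assumptions.

```python
DEFAULT_TIMEOUT_MS = 30_000  # the 30-second default described above


def timeout_statement(dialect: str, timeout_ms: int = DEFAULT_TIMEOUT_MS) -> str:
    """Build the session statement that caps query run time for a dialect."""
    if timeout_ms <= 0:
        raise ValueError("timeout_ms must be positive")
    if dialect == "postgres":
        # A bare integer is interpreted as milliseconds.
        return f"SET statement_timeout = {timeout_ms}"
    if dialect == "mysql":
        # Milliseconds; only affects SELECT statements.
        return f"SET SESSION max_execution_time = {timeout_ms}"
    raise ValueError(f"no timeout mapping sketched for dialect {dialect!r}")
```

For example, a connection with `config.statement_timeout_ms` set to 120000 would map to `SET statement_timeout = 120000` on Postgres under this sketch.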

What an adversarial user can NOT do

  • Execute DML or DDL.
  • Read another tenant’s data, even over a shared connection: scope predicates are ANDed into every query that touches a scoped table.
  • Run a query whose estimated scan exceeds db.explain.max_rows (configurable).
  • Run a query for longer than the statement timeout.
  • Bypass the validator by hand-editing: workbench-edited SQL re-enters the same validator.

What the validator does NOT prevent

  • A poorly-permissioned connection. If the DB user the connection uses has more privileges than you intended, that’s a connection-config problem. Pollen8’s best practice: connect with a least-privilege read-only user.
  • Information disclosure from queries that should have been scope-predicated but weren’t. Make sure your scope predicates cover every table you don’t want exposed.

Audit

Every execute call stamps a Why trace with:

  • Connection id.
  • Original NL question (when planner-routed).
  • Planner SQL + edited SQL (when different).
  • EXPLAIN cost estimate.
  • Execution time + row count.
  • User id + AuthContext token id.

Traces are queryable via the audit table for any SOC 2 / HIPAA accounting need.
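
For orientation, the trace fields listed above could be modelled as a record like the one below. This is only a sketch: the class name, field names, types, and units are assumptions, and the actual audit-table schema may differ.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class WhyTrace:
    connection_id: str
    planner_sql: str
    explain_cost_estimate: float
    execution_time_ms: int
    row_count: int
    user_id: str
    auth_token_id: str
    nl_question: Optional[str] = None  # set only when planner-routed
    edited_sql: Optional[str] = None   # set only when it differs from planner_sql

    @property
    def executed_sql(self) -> str:
        """The SQL that actually ran: the edited version when there is one."""
        return self.edited_sql or self.planner_sql


# All values below are made up for illustration.
trace = WhyTrace(
    connection_id="conn_demo",
    planner_sql="SELECT count(*) FROM encounters",
    explain_cost_estimate=12.5,
    execution_time_ms=84,
    row_count=1,
    user_id="user_demo",
    auth_token_id="tok_demo",
    nl_question="How many encounters are there?",
)
row = asdict(trace)  # plain dict, ready to write to an audit table
```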