Skip to content

Per-firm fine-tuning

Per-firm fine-tuning

Train a custom model on your firm’s Vault so the Assistant and Workflows match your house style and vocabulary. Configure at /admin/legal/fine-tuning.

What it tunes

Style + voice, not facts. The Vault chunks become instruction-style JSONL pairs (~3 per chunk) — summarize-this / extract-clauses / rephrase-as. The fine-tuned model learns your firm’s register; the underlying facts still come from retrieval at inference time.

Backends

  • OpenAI — fully implemented. Uploads JSONL to OpenAI Files, starts a /v1/fine_tuning/jobs run, polls until ready.
  • Anthropic — scaffolded; no self-serve fine-tune API yet, so jobs go to failed: backend_not_supported_yet.
  • Bedrock, Azure OpenAI — same scaffolded state. Roadmap for when those providers’ fine-tune surfaces stabilize.

Credentials are pulled from the tenant’s existing AI Provider configuration — you don’t enter the OpenAI API key twice.

Starting a job

  1. New job → label, backend, base model (gpt-4o-mini-2024-07-18 is a good default).
  2. Pollen8 snapshots the Vault to JSONL (must have ≥10 ready chunks).
  3. Uploads to the provider, creates the fine-tune job, stores the provider_job_id.
  4. Poll status with Refresh until it reaches ready.

Activating

Once a job is ready, Activate flips is_active=true on that row (atomically deactivating any other). From that moment on, the Assistant + every workflow driver substitute the fine-tuned model id for the provider’s default — transparently.

The Why trace stamps fine_tuned: true on every LLM call that used the override, so you can audit attribution.

Going back to stock

Use stock model button bulk-deactivates all jobs for the tenant — Assistant + workflows resume using the provider default. Existing jobs aren’t deleted; they remain in the table for re-activation.

When not to fine-tune

If the Vault is small (under a few hundred chunks), the fine-tune will overfit. Stick with retrieval-only until the corpus matures. If house style is already captured by the base model, the marginal gain is small — measure A/B before promoting.