Cost controls

Bound your LLM spend with an opt-in daily cap and a per-thread deep-dive cap, learn how each cap blocks new work, and monitor spend in Diagnostics.

Forven's AI work runs on your own model key (bring-your-own-key). That means every agent task and every deep-dive turn spends real money against your provider account, not Forven's. Cost controls are the brakes: an opt-in daily spend cap for background agent work, a separate per-thread cap for deep-dive chats, and a running cost figure persisted on every call so you can see where the spend went.

This page is for operators who want a budget ceiling. It covers what each cap is, its default, how a cap blocks new work when reached, and where to watch spend. The AI agent layer is a Forged-tier feature; the caps below only matter once you are running agents or deep-dive on your own key.

Forven never resells model tokens and never sees your key. Spend is between you and your provider. The numbers on this page are illustrative defaults from the app, not a quote.

The two caps

There are two independent ceilings. They do not share a budget and they govern different kinds of work.

| Cap | Config key | Default | Scope | What it gates |
|---|---|---|---|---|
| Daily agent cap | agent_daily_cost_cap_usd | 0 (disabled) | All agent tasks created today | New agent task launches |
| Deep-dive per-thread cap | deepdive.cost_cap_usd | $5 | One deep-dive thread | The next turn in that thread |

The daily cap is opt-in — it ships at 0, which means disabled (no ceiling). You set it deliberately. The deep-dive cap is on by default at $5 per thread, because a single interactive chat is the easiest place to accidentally loop.

Both caps are conservative by design. They exist to stop a runaway loop, not to ration normal use.

The daily agent cap

agent_daily_cost_cap_usd is a single number that caps the total LLM spend across all agent tasks created in one day.

How it works:

  • Every agent task persists a cost_usd value when it runs — estimated from the provider/model token rates and the call's input/output token usage.
  • Before a new task is allowed to launch, Forven checks the cap: get_spend_today() sums cost_usd across all agent_tasks with created_at on or after the start of today. The check is simulation-clock aware, so "today" is determined by Forven's internal clock.
  • If the day's spend has already reached the cap and the cap is greater than 0, the launch is refused with the reason "Daily cap reached".
  • A blocked task is not lost. It stays queued but not launched until either the next day rolls over or you raise the cap.

The check, in outline:

allowed, reason = check_daily_cost_cap()
  → get_daily_cost_cap()   reads settings.agent_daily_cost_cap_usd  (0 = disabled)
  → get_spend_today()      sums cost_usd on today's agent_tasks
  → if spent >= cap and cap > 0:  (allowed=False, reason="Daily cap reached")
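
The outline above can be sketched in Python. This is a minimal illustration, not Forven's source: the function names follow the text, but the AgentTask class, the explicit tasks, cap_usd, and now arguments, and the empty reason string on success are assumptions made so the example is self-contained.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class AgentTask:
    # Hypothetical stand-in for a row in agent_tasks.
    created_at: datetime
    cost_usd: float


def get_spend_today(tasks, now):
    """Sum cost_usd over tasks created on or after the start of 'today'."""
    start_of_day = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return sum(t.cost_usd for t in tasks if t.created_at >= start_of_day)


def check_daily_cost_cap(tasks, cap_usd, now):
    """Return (allowed, reason). A cap of 0 disables the check."""
    spent = get_spend_today(tasks, now)
    if cap_usd > 0 and spent >= cap_usd:
        return False, "Daily cap reached"
    return True, ""  # assumption: the reason is empty when the launch is allowed


now = datetime(2024, 1, 2, 15, 0, tzinfo=timezone.utc)
tasks = [
    AgentTask(datetime(2024, 1, 1, 23, 0, tzinfo=timezone.utc), 4.00),  # yesterday, ignored
    AgentTask(datetime(2024, 1, 2, 9, 0, tzinfo=timezone.utc), 3.00),
    AgentTask(datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc), 2.50),
]
print(check_daily_cost_cap(tasks, 5.00, now))  # (False, 'Daily cap reached')
print(check_daily_cost_cap(tasks, 0, now))     # (True, '')
```

Passing now in explicitly mirrors the simulation-clock point: whichever clock defines "today" decides which tasks are summed.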

Because 0 means disabled, leaving the cap at its default imposes no ceiling at all. If you want a budget, set a positive number.

What the cap does and does not cover

  • Covered: background agent tasks — the work the brain queues and dispatches to agents (code changes, post-mortems, optimization runs, and so on).
  • Not covered by this cap: deep-dive threads. Those have their own per-thread ceiling (below). A deep-dive turn is not counted against agent_daily_cost_cap_usd.

This separation is deliberate. The daily cap protects against autonomous, unattended work piling up spend overnight; the deep-dive cap protects an interactive session you are actively driving.

The deep-dive per-thread cap

deepdive.cost_cap_usd caps the spend of a single deep-dive thread and defaults to $5.

  • Each turn's cost is estimated and added to the thread's running total.
  • When the thread's accumulated cost reaches the cap, the next turn is rejected with a cost-cap error — the send endpoint returns an error status instead of streaming a reply.
  • The cap is per thread, so a fresh thread starts again at zero. Archiving and reopening a thread does not reset its accumulated cost; the thread keeps its total for the record.
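
The per-thread behaviour described in these bullets might look like the following sketch. DeepDiveThread, send_turn, and CostCapError are hypothetical names chosen for illustration; only the $5 default and the "reject the next turn once the total reaches the cap" rule come from the text.

```python
from dataclasses import dataclass

DEEPDIVE_COST_CAP_USD = 5.00  # default for deepdive.cost_cap_usd, per the text


class CostCapError(Exception):
    """Hypothetical error type standing in for the cost-cap error."""


@dataclass
class DeepDiveThread:
    total_cost_usd: float = 0.0  # running total carried by the thread

    def send_turn(self, turn_cost_usd, cap_usd=DEEPDIVE_COST_CAP_USD):
        # Reject before doing any work if the thread is already at its cap.
        if self.total_cost_usd >= cap_usd:
            raise CostCapError("thread cost cap reached")
        # Otherwise the turn runs and its estimated cost joins the total.
        self.total_cost_usd += turn_cost_usd
        return self.total_cost_usd


thread = DeepDiveThread()
thread.send_turn(2.0)   # total is now 2.0
thread.send_turn(3.5)   # total is now 5.5; this turn ran because 2.0 < 5.0
try:
    thread.send_turn(0.1)
except CostCapError as err:
    print("rejected:", err)  # rejected: thread cost cap reached
```

Note that the gate runs before the turn, so the turn that crosses the cap completes and only the following one is refused.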

Inside a single turn there is a second, structural limit worth knowing: the model can loop at most MAX_TOOL_ROUNDS = 8 tool rounds (read code, edit, re-run backtest) before it must produce a final answer. That bounds the work — and therefore the cost — of any one turn before the dollar cap even comes into play. Deep-dive is covered in full on its own page.
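
As a rough sketch of how a tool-round limit bounds one turn: MAX_TOOL_ROUNDS = 8 comes from the text, while run_turn and the next_step callable are hypothetical stand-ins for the model loop, and the forced final answer is an assumed way of ending the turn.

```python
MAX_TOOL_ROUNDS = 8  # value given in the text


def run_turn(next_step):
    """Run one turn.

    next_step(round_index) is a hypothetical callable returning
    ("tool", payload) to request a tool call or ("final", text) to finish.
    Returns (answer, tool_rounds_used).
    """
    for round_index in range(MAX_TOOL_ROUNDS):
        kind, payload = next_step(round_index)
        if kind == "final":
            return payload, round_index
        # kind == "tool": a real system would execute the tool call here.
    # Budget exhausted: end the turn instead of looping again.
    return "final answer (tool budget exhausted)", MAX_TOOL_ROUNDS


# A model that answers after two tool rounds.
print(run_turn(lambda i: ("final", "done") if i == 2 else ("tool", None)))
# ('done', 2)

# A model that would loop forever is cut off at the limit.
print(run_turn(lambda i: ("tool", None)))
# ('final answer (tool budget exhausted)', 8)
```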

Where cost comes from

Both caps rely on the same underlying spend figure. Forven does not read an invoice from your provider — it estimates cost locally:

  1. A call returns usage as {input_tokens, output_tokens}.
  2. estimate_cost_usd(provider, model, usage) applies the per-provider, per-model token rates (hard-coded per provider/model in the app) to produce a USD figure.
  3. That figure is persisted on the originating record — cost_usd on the agent_task, or the running total on the deep-dive thread message.
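
The estimation step can be illustrated with a small sketch. The function name and the usage shape follow the text; the rate table, its per-million-token unit, and the provider and model names are placeholders invented for the example and are not Forven's real rates or any provider's pricing.

```python
# Hypothetical rates in USD per 1,000,000 tokens. Placeholder values only.
RATES_PER_MILLION = {
    ("example-provider", "example-model"): {"input": 3.00, "output": 15.00},
}


def estimate_cost_usd(provider, model, usage):
    """Apply per-provider, per-model token rates to a usage record."""
    rates = RATES_PER_MILLION[(provider, model)]
    return (
        usage["input_tokens"] * rates["input"]
        + usage["output_tokens"] * rates["output"]
    ) / 1_000_000


usage = {"input_tokens": 200_000, "output_tokens": 40_000}
cost = estimate_cost_usd("example-provider", "example-model", usage)
print(f"{cost:.2f}")  # 1.20
```

With these placeholder rates, 200,000 input tokens cost $0.60 and 40,000 output tokens cost $0.60, for an estimate of $1.20.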

Because the figure is derived from token counts and built-in rate tables, treat it as a faithful estimate, not an exact billing ledger. If a provider changes its pricing, the displayed number can drift from your real invoice. The cap is the reliable hard stop; the displayed dollars are a guide. Your provider's own dashboard remains the source of truth for what you were actually charged.

Steps — set a daily spend ceiling

  1. Open Settings and go to the Billing Guard section (under the Models/Agents area).
  2. Set agent_daily_cost_cap_usd to your daily ceiling in USD — for example, an illustrative 5.00. Leave it at 0 to keep the cap disabled.
  3. Save. The value is written to the settings key-value store (forven:settings).
  4. To bound interactive work too, set deepdive.cost_cap_usd to your per-thread ceiling (default $5).
  5. Watch the day's accumulated spend on the Diagnostics cost dashboard.

What you'll see: once the daily cap is set and the day's spend reaches it, new agent tasks stop launching and report "Daily cap reached" — they sit queued until the next day or until you raise the cap. In a deep-dive thread that hits its per-thread cap, the next message is rejected with a cost-cap error rather than streaming a reply, and the thread's cost display shows it is at the ceiling.

Monitoring spend

You can watch cost in three places, each mirroring where it is tracked:

  • Diagnostics → Cost dashboard — the day's aggregate agent spend, the figure the daily cap is checked against.
  • Settings → Billing Guard — where both caps are set and read back.
  • Deep-dive → cost display — the running per-thread total for an open deep-dive chat.

Spend accrues per call as cost_usd, so the dashboards reflect persisted history, not a live meter. Each agent task carries its own cost_usd; the daily figure is their sum for the current day.

Raising or removing a cap

  • Raise the daily cap in Settings → Billing Guard. A blocked queue resumes launching as soon as the day's spend is back under the new ceiling.
  • Disable the daily cap by setting agent_daily_cost_cap_usd back to 0.
  • Adjust the deep-dive cap by updating deepdive.cost_cap_usd (also exposed on the deep-dive cost display). A thread already over its cap accepts turns again once the cap is raised above its accumulated total. Alternatively, start a new thread, which begins at zero.

Cap changes take effect on the next check; tasks already blocked are re-evaluated against the new value rather than failed outright.
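
To make the re-evaluation concrete, here is a minimal sketch in which a queue of blocked tasks is checked again after the cap changes. recheck_queue and the task identifiers are hypothetical; the gate itself (blocked when the cap is positive and the day's spend has reached it) is the rule described earlier on this page.

```python
def recheck_queue(queue, spent_today, cap_usd):
    """Re-check queued tasks against the current cap.

    Returns (launchable, still_queued). Nothing is failed or dropped.
    """
    launchable, still_queued = [], []
    for task_id in queue:
        if cap_usd > 0 and spent_today >= cap_usd:
            still_queued.append(task_id)  # still blocked, stays in the queue
        else:
            launchable.append(task_id)
    return launchable, still_queued


queue = ["task-a", "task-b"]
print(recheck_queue(queue, spent_today=5.5, cap_usd=5.0))   # ([], ['task-a', 'task-b'])
print(recheck_queue(queue, spent_today=5.5, cap_usd=10.0))  # (['task-a', 'task-b'], [])
print(recheck_queue(queue, spent_today=5.5, cap_usd=0))     # (['task-a', 'task-b'], [])
```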

Caveats (beta)

  • The daily cap is disabled by default (0). If you want a budget, you must set one — Forven will not impose a ceiling for you.
  • Cost figures are estimates from token counts and built-in rate tables, not your provider's invoice. Reconcile against your provider dashboard for exact charges.
  • The daily cap gates task launches, not work already in flight: a task that is already running will finish and book its cost even if that pushes the day over the line.
  • The two caps are independent. A generous daily cap does not loosen the per-thread deep-dive cap, and vice versa.
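
The in-flight caveat is easiest to see with numbers. The sketch below assumes tasks launch one at a time and each books its cost before the next check; simulate_day and the task costs are invented for illustration.

```python
def simulate_day(task_costs, cap_usd):
    """Gate each task before launch; book its cost after it runs."""
    spent = 0.0
    launched, queued = [], []
    for cost in task_costs:
        if cap_usd > 0 and spent >= cap_usd:
            queued.append(cost)  # blocked at launch time, stays queued
            continue
        launched.append(cost)
        spent += cost  # booked even if this pushes the day over the cap
    return spent, launched, queued


print(simulate_day([2.0, 2.5, 1.5, 1.0], cap_usd=5.0))
# (6.0, [2.0, 2.5, 1.5], [1.0])
```

The third task launches at $4.50 spent, which is under the $5 cap, and finishes at $6.00. Only the fourth task is held back, so the day's total can exceed the cap by up to the cost of the work already launched.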

Forven is a research tool. Nothing here is financial advice, and no number shown — cost or backtest result — is predictive of future outcomes. Cost controls bound your spend on model inference; they do not bound, and say nothing about, trading risk. For that, see risk controls.