Scheduler & jobs

The Forven scheduler: the 35+ built-in cron and interval jobs that drive the pipeline, and how to enable, disable, and tune their cadence from /ops.

The scheduler is the heartbeat of the Forven lab. It runs the 35+ background jobs that turn the pipeline from a static set of stages into a moving research process: promoting strategies, computing hypothesis verdicts, scanning for live signals, learning daily, and pruning the database. Every autonomous thing Forven does, it does because a scheduler job ran on a tick. This page is the operator's reference to that job system — what the jobs are, how their cadence is decided, and how you enable, disable, and tune them.

You drive the scheduler from the Scheduler section of the Operations console (/ops). The console is where you flip the switches; this page is where you learn what each switch does.

Forven is a research tool. The scheduler automates a research process — it does not create an edge, and no cadence setting guarantees against loss. Nothing here is a prediction, a recommendation, or financial advice.

How the scheduler works

A scheduler job is a named task stored as a row in the scheduler_jobs table in the local SQLite store. It is either a cron job (runs at clock times) or an interval job (schedule_expr is the period in milliseconds, e.g. 300000 for five minutes). Each row carries the job's command, schedule_type, schedule_expr, enabled flag, last_run_at, last_error, and next_run_at.
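As a rough illustration of the row described above, the sketch below models a job and the "is it due?" check a tick might perform. The field names come from the column list; the Python types, the is_due and next_interval_run helpers, and the "noop" command in the usage lines are assumptions made for this example, not Forven's actual code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class SchedulerJob:
    # Field names mirror the scheduler_jobs columns; the types are assumed.
    name: str
    command: str
    schedule_type: str                      # "cron" or "interval"
    schedule_expr: str                      # interval jobs: period in milliseconds
    enabled: bool = True
    last_run_at: Optional[datetime] = None
    last_error: Optional[str] = None
    next_run_at: Optional[datetime] = None


def is_due(job: SchedulerJob, now: datetime) -> bool:
    """Hypothetical helper: a job is due when enabled and next_run_at has passed."""
    return job.enabled and job.next_run_at is not None and job.next_run_at <= now


def next_interval_run(job: SchedulerJob, ran_at: datetime) -> datetime:
    """Hypothetical helper: next run of an interval job is run time plus the period."""
    if job.schedule_type != "interval":
        raise ValueError("cron jobs need a cron parser, which this sketch omits")
    return ran_at + timedelta(milliseconds=int(job.schedule_expr))


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
planner = SchedulerJob("forven-crucible-planner", "noop", "interval", "300000",
                       next_run_at=now)
print(is_due(planner, now), next_interval_run(planner, now))
```

Storing the interval as milliseconds in schedule_expr keeps both schedule types in one text column; the trade-off is that the value has to be parsed on every recompute.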

On each tick the scheduler looks for due jobs, claims them, and runs them. Two switches sit in front of autonomous work:

  • System pause (the master brake). is_system_paused() is the trading halt. It is enforced at the exchange risk gate — while paused, is_trading_allowed() returns false so no new orders are placed. It is orthogonal to the autonomy mode and does not, on its own, freeze the scheduler's research jobs; it stops money from moving.
  • Pipeline autonomy mode. This is what gates the scheduler's autonomous jobs. In manual mode, is_autonomy_paused() is true and the scheduler skips all autonomous jobs on the tick. In semi_auto, is_generation_paused() is true, so the generation jobs (forven-crucible-planner, forven-crucible-discovery, forven-ideation-daily, forven-auto-intake, and the validation cycle forven-testing-cycle) stay frozen while scanning, graduation, and the gauntlet step-loop keep running. In auto, everything runs.

These two switches are independent of each other and live on the autonomy modes page. To freeze research and halt trading at the same time, set manual mode and also engage system pause. The scheduler obeys the mode: a job fires only when the current mode permits its kind.

Generation pause is a third, finer switch. is_generation_paused() freezes the generation set — the crucible planner, crucible discovery, ideation, auto-intake, and the validation/testing cycle (forven-testing-cycle) — while scanning, execution, graduation, and the gauntlet step-loop keep running. It is the safe way to let an in-flight backlog drain through the gauntlet without piling more new candidates on top. Resuming generation has no effect while the mode is manual; change the mode instead.
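The gating rules above can be condensed into a small decision function. This is a sketch under stated assumptions: job_may_run and GENERATION_JOBS are hypothetical names, the mode is passed in as a plain string, and every job is treated as autonomous (the real scheduler may exempt some jobs). System pause is deliberately absent from the inputs, because it halts trading rather than the scheduler's research jobs.

```python
# The generation set as listed in the text above.
GENERATION_JOBS = frozenset({
    "forven-crucible-planner",
    "forven-crucible-discovery",
    "forven-ideation-daily",
    "forven-auto-intake",
    "forven-testing-cycle",
})


def job_may_run(job_name: str, mode: str, generation_paused: bool = False) -> bool:
    """Hypothetical gate: may this job fire under the given autonomy mode?"""
    if mode not in ("manual", "semi_auto", "auto"):
        raise ValueError(f"unknown autonomy mode: {mode!r}")
    if mode == "manual":
        # Manual mode skips all autonomous jobs, whatever the generation flag says.
        return False
    # semi_auto implies generation pause; in auto it applies only if set explicitly.
    generation_frozen = mode == "semi_auto" or generation_paused
    if job_name in GENERATION_JOBS and generation_frozen:
        return False
    return True


print(job_may_run("forven-gauntlet-step-loop", "semi_auto"))
print(job_may_run("forven-testing-cycle", "semi_auto"))
```

Checking manual first mirrors the note that resuming generation has no effect while the mode is manual.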

Startup catch-up

When the app restarts, any job that came due while it was off is overdue. Rather than firing one run per missed interval, the scheduler collapses any job that is more than a minute stale into a single immediate run. You get one fresh run, not a thundering herd.
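A minimal sketch of that collapse rule, assuming the decision is made per job from its stored next_run_at. The function name catch_up_next_run and the STALE_AFTER constant are illustrative, not taken from the Forven source.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(minutes=1)  # "more than a minute stale", per the text


def catch_up_next_run(next_run_at: datetime, now: datetime) -> datetime:
    """Return the next_run_at to use at startup (hypothetical helper).

    A job overdue by more than STALE_AFTER is rescheduled to run once, now,
    no matter how many intervals it missed while the app was off.
    """
    if now - next_run_at > STALE_AFTER:
        return now
    return next_run_at


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
three_hours_late = now - timedelta(hours=3)
print(catch_up_next_run(three_hours_late, now) == now)
```

A job that missed dozens of five-minute intervals therefore produces the same single run as a job that missed two.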

The built-in jobs

Forven ships with 35+ default jobs, created on first run with their next_run_at already computed. They fall into a handful of functional groups. The names below (for example forven-crucible-planner, forven-scanner-hourly, forven-testing-cycle) are the real job identifiers you will see in the /ops Scheduler list.

| Group | Representative jobs | What they do |
| --- | --- | --- |
| Discovery & generation | forven-crucible-planner, ideation/auto-intake | Invent new hypotheses and strategy candidates. Frozen by generation pause and in manual/semi_auto. |
| Testing & promotion | forven-testing-cycle, forven-gauntlet-step-loop, promotion loop | Run strategies through the gauntlet robustness battery and evaluate promotion gates. Note: forven-testing-cycle is itself part of the generation set, so it freezes under generation pause; the gauntlet step-loop keeps draining. |
| Hypothesis lifecycle | verdict loop (~5 min), revisit loop, graduation sweep | Compute proven / disproven / researching verdicts, graduate winners, and re-activate graduated theses when their revisit interval is due. |
| Paper graduation | paper-promotion / paper-graduation step jobs | Monitor strategies in paper and advance the ones that clear the gate. |
| Live scanning | forven-scanner-hourly (the "Live Scanner Execution Worker"), forven-scanner-signal | The multi-strategy scanner: the execution worker evaluates signals and manages open paper positions (exits/stops), while forven-scanner-signal is the signal-only pass. Distinct from the autonomous research daemon. |
| Daily learning | daily learning / post-mortem jobs | Close the learning loop: roll up outcomes into memory and lessons. |
| Data collection | data_manager_collect_* | Keep OHLCV, funding, and other market streams fresh. |
| Maintenance | forven-db-maintenance, source reconciliation | Prune aged rows, checkpoint the WAL, and pre-compute out-of-band gate inputs. See maintenance. |

A single workflow can span several jobs. The gauntlet, for instance, is an async workflow whose steps are claimed and completed across multiple scheduler ticks — walk_forward, then cost_stress, then monte_carlo, and so on — rather than in one blocking run. That is why a strategy can sit in gauntlet for a while: the battery is paced across ticks by design.
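To make the pacing concrete, here is a toy model of a battery that advances one step per tick. It assumes exactly one step is claimed and completed per tick, which the text does not state, and it lists only the three steps named above; the real battery continues beyond them. advance_one_step is a hypothetical name.

```python
from typing import List, Optional

# Only the steps named in the text; the real battery has more.
GAUNTLET_STEPS = ("walk_forward", "cost_stress", "monte_carlo")


def advance_one_step(completed: List[str]) -> Optional[str]:
    """Claim and complete the next pending step; return None once all are done."""
    if len(completed) >= len(GAUNTLET_STEPS):
        return None
    step = GAUNTLET_STEPS[len(completed)]
    completed.append(step)
    return step


progress: List[str] = []
tick = 0
while True:
    tick += 1
    step = advance_one_step(progress)   # one scheduler tick
    if step is None:
        break
    print(f"tick {tick}: ran {step}")
```

Under this model a strategy needs at least as many ticks as there are steps, which is why time spent in the gauntlet scales with the tick cadence as well as with the cost of each step.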

Viewing and tuning jobs

Open /ops and find the Scheduler section. Each row shows the job's name, enabled state, last_run_at, last_error, and next_run_at. Click a job to see its command, schedule_type (cron or interval), and schedule_expr.

To change a job, the console calls PATCH /api/scheduler/{job_id}, either to toggle enabled or to update the cadence. Changes persist in the scheduler_jobs table, and the next tick uses the updated schedule_expr to recompute next_run_at.

# Disable a single job by id (operator-gated; requires FORVEN_OPERATOR_KEY if set)
Invoke-RestMethod -Method Patch `
  -Uri "http://127.0.0.1:8003/api/scheduler/forven-scanner-hourly" `
  -ContentType "application/json" `
  -Body '{"enabled": false}'

# Re-enable it
Invoke-RestMethod -Method Patch `
  -Uri "http://127.0.0.1:8003/api/scheduler/forven-scanner-hourly" `
  -ContentType "application/json" `
  -Body '{"enabled": true}'

Some cadences are clamped. The crucible planner ships with a 5-minute default interval (_CRUCIBLE_PLANNER_INTERVAL_SECONDS). A settings override passes through an integer clamp (_coerce_int) with a 1-minute floor and a 1440-minute (24-hour) ceiling, so you cannot drive the planner faster than once a minute. These clamps are compute-safety guards: they stop a misconfigured cadence from hammering the lab.
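The clamp is easiest to see with numbers. The helper below is a stand-in written for this page: the real _coerce_int signature is not shown here, so coerce_int, its parameter names, and planner_interval_minutes are assumptions. Only the 5-minute default, the 1-minute floor, and the 1440-minute ceiling come from the text.

```python
def coerce_int(value, default: int, minimum: int, maximum: int) -> int:
    """Parse value as an int, fall back to default, then clamp to [minimum, maximum]."""
    try:
        parsed = int(value)
    except (TypeError, ValueError):
        parsed = default
    return max(minimum, min(maximum, parsed))


def planner_interval_minutes(override) -> int:
    """Hypothetical wrapper applying the crucible planner's documented bounds."""
    return coerce_int(override, default=5, minimum=1, maximum=1440)


for raw in (0, None, "30", 99999):
    print(repr(raw), "->", planner_interval_minutes(raw))
```

Falling back to the default on unparseable input, rather than raising, means a malformed setting slows nothing down and speeds nothing up.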

Throughput control

Rather than tune dozens of jobs by hand, you can let Forven scale cadences for you. When throughput_auto_scheduler_control=true in settings (the default), a runtime override (_apply_runtime_scheduler_overrides()) recomputes the cadence of the generation, testing, graduation, and scanner jobs from a handful of per-loop interval settings in forven:settings:

  • crucible_planner_interval_minutes, ideation_interval_minutes, testing_interval_minutes, graduation_interval_minutes
  • scanner_signal_interval_minutes, scanner_execution_interval_minutes

Lower an interval to make that loop run more often (the lab works through the backlog faster); raise it to slow the burn. The values are clamped to a 1-minute floor. These overrides are the knob you reach for when you want the lab to do more research per hour and your machine and model budget can take the load. Per-job PATCH overrides still work for the cases where you need to single out one loop.
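One way to picture the override is as a translation from per-loop minutes to per-job schedule_expr values in milliseconds. The sketch below is not _apply_runtime_scheduler_overrides() itself. The setting-to-job pairing in SETTING_TO_JOB is a guess based on matching names (the ideation and graduation settings are left out because this page does not name their jobs unambiguously), and runtime_schedule_exprs is a hypothetical function.

```python
# Assumed pairing of settings to job ids, inferred from matching names only.
SETTING_TO_JOB = {
    "crucible_planner_interval_minutes": "forven-crucible-planner",
    "testing_interval_minutes": "forven-testing-cycle",
    "scanner_signal_interval_minutes": "forven-scanner-signal",
    "scanner_execution_interval_minutes": "forven-scanner-hourly",
}

MS_PER_MINUTE = 60_000


def runtime_schedule_exprs(settings: dict) -> dict:
    """Map per-loop minute settings to interval schedule_expr strings (ms)."""
    # The text says the control flag defaults to true.
    if not settings.get("throughput_auto_scheduler_control", True):
        return {}
    exprs = {}
    for key, job_id in SETTING_TO_JOB.items():
        if key in settings:
            minutes = max(1, int(settings[key]))   # documented 1-minute floor
            exprs[job_id] = str(minutes * MS_PER_MINUTE)
    return exprs


print(runtime_schedule_exprs({"testing_interval_minutes": 5,
                              "scanner_execution_interval_minutes": 0}))
```

In this model a setting of 0 is raised to the floor instead of being rejected, so the job still runs, just no faster than once a minute.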

A separate setting, adaptive_pipeline_throughput_enabled (off by default), lets the testing cycle auto-scale its own pacing toward a pipeline_target_clear_hours goal.

# Speed up the testing and scanner loops (cadences recompute on the next tick)
Invoke-RestMethod -Method Put `
  -Uri "http://127.0.0.1:8003/api/settings/pipeline" `
  -ContentType "application/json" `
  -Body '{"testing_interval_minutes": 5, "scanner_execution_interval_minutes": 5, "throughput_auto_scheduler_control": true}'

Stale locks and zombie threads

When a job runs, the scheduler stamps a running_since lock in the DB so the same job cannot start twice. If a job hangs or times out, that lock could be left held — so the scheduler self-heals.

On each tick (and at startup, via reset_scheduler_job_locks), recover_stale_scheduler_job_locks() checks each held lock. It will not force-recover a lock while the job's background task is unfinished (asyncio.Task.done() is false) or its worker thread is still alive. That is the zombie-thread tracking behaviour (ticket B-30): the lock is held until the thread exits so the job cannot run twice. Only when no live task or thread is found and running_since is older than the stale threshold (which varies by job kind and is capped at an absolute maximum of 3900 s) does it force-recover the lock by clearing running_since. The next tick then re-runs the job immediately.
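The recovery decision reduces to three checks made in order. The function below is a sketch of that ordering, not the body of recover_stale_scheduler_job_locks(). The name should_force_recover, its parameters, and the 600-second per-kind threshold in the usage lines are illustrative; only the 3900-second cap is taken from the text.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

ABSOLUTE_MAX_STALE_SECONDS = 3900  # cap stated in the text


def should_force_recover(
    running_since: Optional[datetime],
    now: datetime,
    stale_threshold_seconds: float,
    task_alive: bool,
    thread_alive: bool,
) -> bool:
    """Decide whether a held lock may be cleared (hypothetical helper)."""
    if running_since is None:
        return False                      # no lock is held
    if task_alive or thread_alive:
        return False                      # live work: keep the lock (B-30 behaviour)
    threshold = min(stale_threshold_seconds, ABSOLUTE_MAX_STALE_SECONDS)
    return (now - running_since).total_seconds() > threshold


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
held_since = now - timedelta(seconds=4000)
print(should_force_recover(held_since, now, 600, task_alive=False, thread_alive=True))
print(should_force_recover(held_since, now, 600, task_alive=False, thread_alive=False))
```

Putting the liveness check ahead of the age check is what makes a long-held lock safe: age alone never clears a lock that a live thread still owns.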

A "running" job is not always stuck. Because recovery deliberately keeps a lock while a worker thread is alive, a job that has been "running" for a while is usually waiting on a slow external call (a sluggish LLM provider or exchange), not hung. If a lock genuinely never clears, see troubleshooting.

Steps: enable and tune a job

A short sequence to bring a job under your control from /ops.

  1. Open /ops and scroll to the Scheduler section.
  2. Find the job by name (for example forven-scanner-hourly). Read its last_run_at, last_error, and next_run_at.
  3. To pause just that loop, toggle it off (PATCH /api/scheduler/{job_id} with {"enabled": false}); toggle it back on the same way.
  4. To change how often it runs, edit its cadence — the next tick recomputes next_run_at from the new schedule_expr.
  5. To scale several loops at once instead, leave the individual jobs alone and adjust the per-loop interval settings (testing_interval_minutes, scanner_execution_interval_minutes, and friends) with throughput_auto_scheduler_control enabled.
  6. Confirm the change took: the row's next_run_at should advance to a sensible future time, and last_error should stay empty.

What you'll see

In the /ops Scheduler list, a healthy job has a future next_run_at, a recent last_run_at, and an empty last_error. After you disable a job its next_run_at stops advancing; after you re-enable it the next tick fills in a fresh next_run_at. After a throughput change, the affected jobs' next_run_at values shift to reflect the new cadence on the following tick. Your first sign of a stuck loop is a row whose last_error has filled in or whose next_run_at never moves.

Caveats

  • The scheduler obeys the autonomy mode first. No cadence you set matters while manual mode blocks the job. (System pause is a separate trading halt — it stops orders at the exchange gate, not the scheduler's research loops.) Check the mode before assuming a job is broken.
  • Cadence floors are real. The crucible planner (and similar compute-heavy jobs) clamp to a minimum interval; you cannot drive them arbitrarily fast.
  • Held locks can be legitimate. Zombie-thread tracking keeps a lock while a worker thread lives, so a long-"running" job may just be waiting on a slow LLM or exchange call rather than being hung.
  • Startup collapses overdue runs. After a restart, jobs more than a minute stale fire once, not once per missed interval — do not read a single catch-up run as a misfire.
  • The numbers here — the 3900s stale-lock cap, the 5-minute crucible-planner default (1-minute clamp floor), the 5-minute verdict-loop tick, "35+" jobs — are defaults from the current build, not guarantees, and may change between releases.