Lazy Investor AI Plan¶

Status¶

This document is no longer an abstract vision note.

As of 2026-03-14, the rollout described here reached a platform-grade state in code and is backed by normative architecture artifacts.

Its purpose is to record how the AI layer should fit into the already accepted IRIS architecture model.

This is a bridge document between the current hypothesis_engine and the future shared core.ai product capability layer.

Normative rollout artifacts now exist in:

ADR 0015 and companion architecture policy documents
AI runtime policy
AI performance budgets
delivery and architecture governance docs

Already implemented:

migration of hypothesis_generate to a capability-aware executor
brief_generate as an analytical snapshot surface
explain_generate as a bounded explanation capability
AI operator and admin surfaces in the existing control_plane
prompt, task, and provider separation with enforced prompt policy

Goal¶

Build a unified AI layer for IRIS that:

goes beyond hypothesis_engine alone;
works with multiple real providers;
is enabled through capability-by-configuration rather than by a single boolean feature flag;
supports hypothesis_generate, notification_humanize, brief_generate, and explain_generate;
respects the language from instance settings or an explicitly passed language / locale;
does not break the existing service/runtime governance model;
does not turn deterministic domains into an LLM-first system.

This must not become “yet another AI app.” It must become a proper platform layer for the product scenario of a calm, low-friction investor workflow.

Repository Architectural Context¶

The AI plan must live inside the already accepted operating model of the repository.

Implications:

AI must fit into core/, apps/, and runtime/, not create a parallel architecture;
AI governance must use the same policy-doc and ADR model as the rest of the repo;
heavy AI paths must obey runtime idempotency, retry, and concurrency rules;
AI surfaces must respect the existing mode/profile-aware HTTP model;
briefs and AI-derived reads must fit analytical snapshot semantics rather than living on an isolated “AI island.”

Current Repository State¶

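IRIS already had a working AI contour when this rollout began, but it was not yet platform-grade. The items below record that baseline; the Status section above lists what has since been implemented.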

What already exists:

generation flow for hypotheses;
deterministic evaluation jobs for hypotheses;
read surface, job triggers, and SSE insights.

Main architectural debts:

enable_hypothesis_engine is still the primary switch.
Prompt data and infrastructure routing are still too close together.
HYPOTHESIS_OUTPUT_SCHEMA is not yet a true output-contract enforcement path.
Heuristic fallback is still too implicit: it can stand in for a real provider without that being visible.
Generation and evaluation are conceptually different, but still named too generically in the product model.

Main Architectural Principle¶

AI is not “new business logic.”

AI is the next layer on top of already existing machine-canonical domains.

That means:

signals, predictions, cross_market, portfolio, anomalies, and market_structure continue to compute canonical facts themselves;
AI operates on typed facts, events, and read models;
AI explains, humanizes, summarizes, and generates hypotheses or briefs;
AI does not become the source of trading truth.

Non-Negotiable Rules¶

1. Deterministic Domains Remain Canonical¶

LLMs do not replace:

signal generation
market structure
portfolio rules
risk rules
automation-critical decisions

All automation-critical outputs remain deterministic.

2. LLM Adapters Do Not Live in Domain Services¶

Do not pull httpx, provider SDKs, OpenAI, Ollama, or similar clients into:

domain services
analytical services
regular orchestration services

All external AI calls must live in a shared AI platform layer.

3. Feature Flags Stop Being the Primary Source of Truth¶

A capability is available only if:

at least one real provider is configured;
that provider is allowed for the capability;
runtime policy permits execution in the current mode/profile;
health and degraded-state policy do not block execution.

A single boolean like enable_hypothesis_engine must not decide the fate of the entire AI layer.
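
The four conditions above can be sketched as a single availability check. This is a minimal illustration, not the repository's implementation: ProviderEntry, RuntimePolicy, capability_available, the reason strings, and the mode/profile names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProviderEntry:
    """Hypothetical provider record; field names are illustrative only."""
    key: str
    allowed_capabilities: frozenset
    is_real: bool = True   # a heuristic fallback is not a real provider
    healthy: bool = True


@dataclass(frozen=True)
class RuntimePolicy:
    """Hypothetical policy: capabilities permitted per (mode, profile) pair."""
    permitted: dict = field(default_factory=dict)

    def permits(self, capability: str, mode: str, profile: str) -> bool:
        return capability in self.permitted.get((mode, profile), frozenset())


def capability_available(capability, providers, policy, mode, profile):
    """Return (available, reason), checking the four conditions in order."""
    real = [p for p in providers if p.is_real]
    if not real:
        return False, "no_real_provider"
    allowed = [p for p in real if capability in p.allowed_capabilities]
    if not allowed:
        return False, "provider_not_allowed"
    if not policy.permits(capability, mode, profile):
        return False, "blocked_by_runtime_policy"
    if not any(p.healthy for p in allowed):
        return False, "blocked_by_health"
    return True, "ok"
```

Returning a reason alongside the boolean keeps the decision explainable on operator surfaces, which a bare feature flag cannot do.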

4. Prompts Must Not Control Infrastructure Routing¶

Prompts and prompt variables may store only:

task-specific context;
style and wording hints;
safe semantic defaults.

Prompts must not store or mutate:

provider routing;
base URLs;
auth or transport config;
capability enablement.

5. Heuristic Fallback Is a Degraded Strategy, Not a Peer Provider¶

Rule-based fallback is useful, but it must not:

count as “AI enabled”;
hide provider outage;
be treated as a peer to real LLM providers;
silently mask validation or network failures.

6. Structured-First, Humanized-Second¶

The typed contract comes first in every AI use case.

The system must first produce a typed result or typed canonical input/output envelope, and only then add humanized text where the capability actually requires it.

7. AI Execution Must Respect the Language / Locale Contract¶

AI capabilities must not guess output language on their own.

Language must come from a formal and predictable source:

explicit language / locale for the current request, job trigger, or delivery target;
stored preference if such a layer exists later;
instance default from settings.language;
fallback to en.

Machine-canonical fields remain language-neutral.
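
The resolution order above could look roughly like the following. It is a sketch under assumptions: resolve_language, LanguageDecision, and the source labels are hypothetical names, and no validation of language tags is attempted.

```python
from dataclasses import dataclass
from typing import Optional

FALLBACK_LANGUAGE = "en"


@dataclass(frozen=True)
class LanguageDecision:
    requested: Optional[str]   # what the caller explicitly asked for, if anything
    effective: str             # what the execution contract decided
    source: str                # why this language was chosen


def resolve_language(
    explicit: Optional[str] = None,
    stored_preference: Optional[str] = None,
    instance_default: Optional[str] = None,
) -> LanguageDecision:
    """Walk the sources in priority order and keep the reason for the choice."""
    candidates = (
        ("explicit", explicit),
        ("stored_preference", stored_preference),
        ("instance_default", instance_default),
    )
    for source, value in candidates:
        if value and value.strip():
            return LanguageDecision(explicit, value.strip(), source)
    return LanguageDecision(explicit, FALLBACK_LANGUAGE, "fallback")
```

Keeping the source next to the effective language means the reason for a given choice is never lost, and the prompt only ever receives the already-decided value.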

8. Context Serialization Is a Separate Execution Concern¶

Typed context bundles must not leak directly into prompts “as whatever happened to be available.”

Flow:

a deterministic context builder assembles typed facts;
the execution layer chooses a context transport format;
only then is context serialized into prompt input.

Capability Model¶

Top-level capabilities must stay compact.

The platform layer should define a small set of stable capabilities, while detail lives in typed input and output contracts rather than in a bloated registry surface.

Capability, Task, Prompt, Provider: Explicit Separation¶

The system must distinguish four different concepts.

Capability¶

This is the runtime, policy, and exposure unit.

Capability defines:

whether something may run at all;
which providers are allowed;
which execution modes are valid;
which degraded policy applies;
which API and runtime surfaces are exposed.

Task¶

This is the concrete prompt contract inside a capability.

Tasks are not provider routing and not feature flags.

Prompt¶

This is a versioned template, schema, or style artifact for a task.

Prompt is responsible for:

semantic defaults;
wording constraints;
safe style behavior.

Provider¶

This is the infrastructure adapter.

It owns transport details, auth, limits, and concrete execution mechanics.
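
One way to make the four-way separation concrete is to give each concept its own type, so routing fields simply have nowhere to live on a prompt or task. The field names below are assumptions chosen for illustration, not the actual core.ai contracts.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Capability:
    """Runtime, policy, and exposure unit."""
    name: str
    allowed_providers: frozenset
    execution_modes: frozenset
    degraded_policy: str
    exposed_surfaces: frozenset


@dataclass(frozen=True)
class Task:
    """Concrete prompt contract inside a capability."""
    name: str
    capability: str
    prompt_name: str
    allowed_context_formats: frozenset


@dataclass(frozen=True)
class Prompt:
    """Versioned template plus semantic defaults and style hints."""
    name: str
    version: int
    template: str
    style_hints: tuple = ()


@dataclass(frozen=True)
class Provider:
    """Infrastructure adapter configuration."""
    key: str
    base_url: str
    auth_env_var: str
    timeout_seconds: float


# Hypothetical list of field names that count as infrastructure routing.
ROUTING_FIELDS = frozenset({"provider", "base_url", "auth_env_var", "allowed_providers"})


def has_no_routing_fields(cls) -> bool:
    """True when a dataclass declares none of the routing-related fields."""
    return {f.name for f in fields(cls)}.isdisjoint(ROUTING_FIELDS)
```

A check such as has_no_routing_fields(Prompt) could run in a unit test to guard the separation as the contracts evolve.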

Language / Locale Contract¶

Language must be part of the AI execution contract, not prompt magic.

Resolution Order¶

explicit language / locale for the current request, job trigger, or delivery target
stored preference, if such a layer exists later
settings.language as instance default
fallback to en

Prompt Interaction¶

Prompt may use language only as semantic execution input:

wording selection
tone/profile selection
language-aware output-schema constraints

Prompt must not override the effective language decided by the execution contract.

Execution Metadata¶

Each AI result should preserve:

requested language
effective language
requested provider
actual provider
degraded/offline status

Output Rule¶

If a capability returns human-facing text, it must be generated in the effective language.

Forbidden:

silent provider-default language
mixed-language output without explicit mode
losing the reason why a given language was chosen

Context Transport Contract¶

General Principle¶

Inside domain layers, the source of truth remains the typed context bundle.

At the boundary between domain code and AI execution, that context is transformed into one of the supported transport formats:

json
compact_json
toon
csv

The execution layer chooses this by policy, not the domain service and not the prompt.

Practical Selection Rule¶

json — maximum compatibility and simplest pipeline
compact_json — still JSON, but with reduced noise
toon — repeated row-like objects, logs, candles, tables, metrics
csv — flat table-like data where nesting is unnecessary

Formal Rules¶

one task may allow only a bounded whitelist of formats
prompt does not serialize context itself
the execution layer provides already serialized input plus metadata about the format
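
As a rough sketch of these rules, the execution layer can intersect a task's whitelist with its own policy preference and then serialize. choose_format and serialize_context are hypothetical helpers; the context is assumed to be a list of flat dicts, and toon is deliberately left unimplemented because its encoding is not described here.

```python
import csv
import io
import json

SUPPORTED_FORMATS = ("json", "compact_json", "toon", "csv")


def choose_format(task_allowed, policy_preference):
    """Pick the first policy-preferred format that the task whitelist allows."""
    for fmt in policy_preference:
        if fmt in SUPPORTED_FORMATS and fmt in task_allowed:
            return fmt
    raise ValueError("no context format satisfies both task whitelist and policy")


def serialize_context(rows, fmt):
    """Serialize flat rows; return (payload, metadata about the format)."""
    if fmt == "json":
        payload = json.dumps(rows, indent=2, sort_keys=True)
    elif fmt == "compact_json":
        payload = json.dumps(rows, separators=(",", ":"), sort_keys=True)
    elif fmt == "csv":
        buffer = io.StringIO()
        fieldnames = sorted(rows[0]) if rows else []
        writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)
        payload = buffer.getvalue()
    else:
        # toon (and anything unknown) is out of scope for this sketch.
        raise NotImplementedError(fmt)
    return payload, {"context_format": fmt, "row_count": len(rows)}
```

The prompt receives only the payload and the metadata, so it never needs to know how the typed bundle was laid out.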

Why This Matters¶

Without this policy, the system degrades into a mixture of arbitrary payload shapes and incompatible prompt assumptions.

Prompt / Task Interaction¶

A task may constrain allowed context formats, but the execution layer still chooses the final effective format.

Execution Metadata¶

Each AI execution result should link back to:

context format
prompt version
provider
language
capability
task

Provider Model¶

Replace scattered provider settings with a typed provider registry.

Each provider entry should include:

provider key
transport config
health state
latency and cost metadata
compliance tier
allowed capabilities
optional capability-specific overrides

Requested vs Actual Provider¶

Each execution result must distinguish between:

requested provider
actual provider

This is required for fallback, observability, and auditability.
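
A fallback chain that keeps both values might be sketched as follows. execute_with_fallback, ExecutionResult, and ProviderError are hypothetical names, and the call argument stands in for whatever adapter invocation the platform actually uses.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


class ProviderError(Exception):
    """Hypothetical error raised by a provider adapter."""


@dataclass(frozen=True)
class ExecutionResult:
    requested_provider: str
    actual_provider: Optional[str]   # None when every provider failed
    output: Optional[str]
    errors: tuple                    # (provider_key, message) pairs, in order


def execute_with_fallback(
    requested: str,
    chain: Sequence[str],
    call: Callable[[str], str],
) -> ExecutionResult:
    """Try the requested provider first, then the rest of the chain."""
    order = [requested] + [key for key in chain if key != requested]
    errors = []
    for key in order:
        try:
            output = call(key)
        except ProviderError as exc:
            errors.append((key, str(exc)))
            continue
        return ExecutionResult(requested, key, output, tuple(errors))
    return ExecutionResult(requested, None, None, tuple(errors))
```

Because failed attempts are recorded rather than swallowed, a result served by a secondary provider remains distinguishable from a clean run.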

Output Contract Enforcement¶

“Ask the prompt to return JSON” is not enough.

The platform needs a real schema-first execution path with:

validation
typed result parsing
explicit failure classes
degraded status handling

Minimum statuses:

healthy
degraded
offline
invalid_output
unavailable

LLM output must not silently become “something close enough.”
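
A schema-first path can be illustrated with the standard library alone. HypothesisDraft and its two fields are invented for this sketch and do not reflect HYPOTHESIS_OUTPUT_SCHEMA; only the status names come from the list above.

```python
import json
from dataclasses import dataclass
from enum import Enum


class ExecutionStatus(str, Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    OFFLINE = "offline"
    INVALID_OUTPUT = "invalid_output"
    UNAVAILABLE = "unavailable"


class InvalidOutputError(Exception):
    """Provider output did not satisfy the typed contract."""


@dataclass(frozen=True)
class HypothesisDraft:
    """Hypothetical typed result used only for illustration."""
    statement: str
    confidence: float


def parse_hypothesis(raw: str) -> HypothesisDraft:
    """Parse and validate strictly; nothing 'close enough' is accepted."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise InvalidOutputError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise InvalidOutputError("expected a JSON object")
    unexpected = set(data) - {"statement", "confidence"}
    if unexpected:
        raise InvalidOutputError(f"unexpected fields: {sorted(unexpected)}")
    statement = data.get("statement")
    confidence = data.get("confidence")
    if not isinstance(statement, str) or not statement.strip():
        raise InvalidOutputError("statement must be a non-empty string")
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise InvalidOutputError("confidence must be a number")
    if not 0.0 <= confidence <= 1.0:
        raise InvalidOutputError("confidence must be within [0, 1]")
    return HypothesisDraft(statement.strip(), float(confidence))


def run_contract(raw: str):
    """Map parsing outcome to an explicit status instead of a silent guess."""
    try:
        return ExecutionStatus.HEALTHY, parse_hypothesis(raw)
    except InvalidOutputError:
        return ExecutionStatus.INVALID_OUTPUT, None
```

In a real pipeline the error message would also be logged or attached to execution metadata; it is dropped here only to keep the sketch short.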

Shared AI Platform Layer¶

The first safe step is not a new giant app, but a shared core.ai layer.

Target structure:

backend/src/core/ai/
  capabilities/
  contracts/
  execution/
  prompts/
  providers/
  registry/
  validation/
  degraded/

Role:

registry for capabilities and providers
execution engine
prompt policy
output validation
degraded-mode handling
common AI contracts

What Moves First¶

Initial refactor should stay narrow:

ReasoningService becomes a thin orchestration layer over AIExecutor.execute(...)

That yields a platform layer without an immediate large rename of domain apps.

What to Do With the Current `hypothesis_engine`¶

Do not start with a big rename under lazy_investor.

The correct path:

Phase 1¶

Keep src/apps/hypothesis_engine as a domain app but move it onto shared core.ai.

Phase 2¶

Add notification_humanize on top of the shared platform.

Phase 3¶

Add brief_generate as an analytical snapshot surface.

Phase 4¶

Only after at least two new real capabilities stabilize, decide whether a separate src/apps/lazy_investor is justified.

Rule:

Do not create a large apps/lazy_investor package just for naming aesthetics.

Generation vs Evaluation: Explicit Separation¶

This must become an architectural invariant.

`hypothesis_generate`¶

This is an AI capability:

depends on provider availability
uses prompt, task, and provider routing
publishes AI-derived artifacts

`hypothesis_evaluation`¶

This is a deterministic service lifecycle:

uses ordinary jobs, locks, and tracked operations
does not depend on real LLM-provider availability
is not disabled together with generation surfaces
remains observable even when AI generation is offline

Read surfaces for hypotheses and evaluations must not disappear only because a provider is absent.

Runtime Gating: Not Only Settings¶

Migration to capability-by-configuration must affect three layers:

HTTP surface mounting
worker-group existence
automatically enabled background jobs

AI availability must not be decided in one place and bypassed in two others.

Mode / Profile Matrix¶

The AI surface must be mode-aware.

Minimum principle:

human-facing reads may exist without generation;
generation, admin, and streaming surfaces do not need to be available in the HA embedded profile;
public availability follows the same governance model as the rest of the HTTP surface.

Failure Domains and Degraded Modes¶

`healthy`¶

at least one real provider exists
validation path works
capability runs in normal mode

`degraded`¶

a fallback chain is used
or the capability is temporarily allowed only via a deterministic degraded strategy

Examples:

notification_humanize may degrade to template-based humanization
explain_generate may degrade to a bounded deterministic summary
hypothesis_generate should not silently degrade into pseudo-LLM behavior unless product policy explicitly allows it

`offline`¶

no real providers are available
generation capabilities do not run
read surfaces remain
deterministic evaluation continues
runtime returns typed unavailable or skipped instead of pretending success
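
One possible reading of the three states, reduced to a pure function, is shown below. The inputs and their names are assumptions; in particular, treating "no real provider, but a deterministic degraded strategy is allowed" as degraded rather than offline is an interpretation of the text above, not a confirmed rule.

```python
from enum import Enum


class CapabilityState(str, Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    OFFLINE = "offline"


def derive_state(
    real_providers_available: int,
    fallback_in_use: bool,
    degraded_strategy_allowed: bool,
) -> CapabilityState:
    """Derive a capability's state from provider and policy facts."""
    if real_providers_available <= 0:
        # Without real providers, only an explicitly allowed deterministic
        # strategy keeps the capability out of the offline state.
        if degraded_strategy_allowed:
            return CapabilityState.DEGRADED
        return CapabilityState.OFFLINE
    if fallback_in_use:
        return CapabilityState.DEGRADED
    return CapabilityState.HEALTHY
```

Under this reading, notification_humanize with a template strategy would report degraded during an outage, while hypothesis_generate without such a policy would report offline and return a typed unavailable result.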

Do Not Block the Rest of Runtime¶

This is a separate invariant.

Heavy AI capabilities must never run on shared analytical worker lanes.

Rules:

dedicated AI worker groups
dedicated concurrency budgets
dedicated timeout and performance budgets
no AI outage impact on deterministic signal, prediction, or portfolio paths
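
The budget idea can be sketched with asyncio. AIWorkerBudget is a hypothetical helper; real isolation still comes from dedicated worker groups, and this only shows a bounded concurrency slot plus a hard timeout applied to each AI job.

```python
import asyncio
from typing import Any, Awaitable, Callable, Tuple


class AIWorkerBudget:
    """Hypothetical per-group budget: bounded concurrency and a hard timeout."""

    def __init__(self, max_concurrency: int, timeout_seconds: float) -> None:
        self._semaphore = asyncio.Semaphore(max_concurrency)
        self._timeout = timeout_seconds

    async def run(self, job: Callable[[], Awaitable[Any]]) -> Tuple[str, Any]:
        """Run one job inside the budget; return a typed outcome, never hang."""
        async with self._semaphore:
            try:
                # The timeout starts once a slot is acquired, not while queued.
                result = await asyncio.wait_for(job(), timeout=self._timeout)
            except asyncio.TimeoutError:
                return "timeout", None
            return "ok", result
```

A caller would create one budget per AI worker group inside the running event loop and route every provider call through run, so a slow provider exhausts only its own slots.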

Storage and Observability¶

Do not force everything into a universal AIArtifact table.

hypotheses, notifications, and briefs have different lifecycles.

Instead:

keep artifact-specific storage;
standardize a shared execution-metadata envelope.

Minimum metadata:

capability
task
prompt version
provider
language
degraded state
validation status
request / operation linkage
timestamps

For hypotheses, this is an extension of already-strong traceability rather than a redesign.
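
The shared envelope might be modeled as a small immutable record that each artifact-specific table embeds or references. ExecutionMetadata and to_record are hypothetical; the fields follow the minimum metadata list above, and the task name in the usage note is invented.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ExecutionMetadata:
    """Hypothetical shared envelope stored next to each AI artifact."""
    capability: str
    task: str
    prompt_version: str
    provider: str
    language: str
    degraded_state: str
    validation_status: str
    operation_id: str          # request / operation linkage
    started_at: datetime
    finished_at: datetime

    def to_record(self) -> dict:
        """Flatten to JSON-friendly values for storage or logging."""
        record = asdict(self)
        record["started_at"] = self.started_at.isoformat()
        record["finished_at"] = self.finished_at.isoformat()
        return record
```

For example, a brief could carry ExecutionMetadata(capability="brief_generate", task="daily_brief", ...) while its payload stays in brief-specific storage.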

Minimum metrics:

provider latency
provider failure rate
validation failure rate
degraded/offline rate by capability
worker saturation
output-size and timeout pressure where relevant

Notification Humanization as the First New Capability¶

The first new capability after migration should be notification_humanize, not brief_generate.

Why:

short context
bounded output
easier degraded strategy
clearer product value

Briefs as an Analytical Snapshot Surface¶

brief_generate should behave as an analytical snapshot surface:

async generation
cached read surface
bounded payload
explicit freshness metadata

It must not become a hidden synchronous endpoint that does heavyweight provider work during reads.
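
The snapshot semantics can be illustrated with an in-memory stand-in. BriefStore, BriefSnapshot, and the response keys are hypothetical; the point is that read never calls a provider and always reports freshness.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class BriefSnapshot:
    payload: str
    generated_at: datetime


class BriefStore:
    """In-memory stand-in for a cached brief read surface (illustrative)."""

    def __init__(self, max_age: timedelta, max_payload_chars: int) -> None:
        self._max_age = max_age
        self._max_payload_chars = max_payload_chars
        self._snapshots = {}

    def publish(self, key: str, payload: str, now: datetime) -> None:
        """Called by the async generation job, never by the read path."""
        if len(payload) > self._max_payload_chars:
            raise ValueError("brief payload exceeds the bounded size")
        self._snapshots[key] = BriefSnapshot(payload, now)

    def read(self, key: str, now: datetime) -> dict:
        """Cheap read: return the last snapshot with explicit freshness."""
        snapshot = self._snapshots.get(key)
        if snapshot is None:
            return {"status": "missing", "payload": None,
                    "generated_at": None, "stale": None}
        return {
            "status": "ok",
            "payload": snapshot.payload,
            "generated_at": snapshot.generated_at.isoformat(),
            "stale": now - snapshot.generated_at > self._max_age,
        }
```

A stale snapshot is still served, with stale set to True, so readers see the last known brief instead of waiting on generation.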

Operator / Admin Surface¶

Operator and admin control for AI should live in existing governance surfaces rather than in a disconnected AI mini-admin.

Expose:

provider health
capability availability
prompt policy state
degraded or offline reason
selected execution metrics

Prompt Policy¶

Prompt policy must explicitly constrain:

allowed variables
max context size
allowed context formats
provider restrictions
language behavior
output schema
safety and non-routing guarantees
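
Part of such a policy can be enforced mechanically before a prompt is rendered. The sketch below covers only variables, context size, and context format; PromptPolicy, check_prompt_input, the violation labels, and the forbidden key list are assumptions for illustration.

```python
from dataclasses import dataclass

# Hypothetical variable names that would amount to infrastructure routing.
FORBIDDEN_VARIABLES = frozenset(
    {"provider", "base_url", "api_key", "auth", "transport", "enabled"}
)


@dataclass(frozen=True)
class PromptPolicy:
    allowed_variables: frozenset
    max_context_chars: int
    allowed_context_formats: frozenset


def check_prompt_input(policy, variables, context, context_format):
    """Return a list of violations; an empty list means the input is accepted."""
    violations = []
    for name in sorted(variables):
        if name in FORBIDDEN_VARIABLES:
            violations.append(f"routing_variable:{name}")
        elif name not in policy.allowed_variables:
            violations.append(f"unknown_variable:{name}")
    if len(context) > policy.max_context_chars:
        violations.append("context_too_large")
    if context_format not in policy.allowed_context_formats:
        violations.append(f"format_not_allowed:{context_format}")
    return violations
```

Collecting every violation, rather than stopping at the first, gives operators a complete picture of why a prompt input was rejected.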

Rollout Plan¶

Stage 1. Governance and Foundations¶

ADR and policy alignment
capability model
prompt policy
failure-domain rules

Stage 2. `core.ai` Foundation¶

registry
executor
provider adapters
validation

Stage 3. Hypothesis Migration¶

move hypothesis_generate to the shared executor
keep deterministic evaluation separate

Stage 4. Notification Humanization¶

add first new real capability
validate degraded mode and bounded contracts

Stage 5. Briefs¶

add asynchronous brief generation
expose read surface with snapshot semantics

Stage 6. Optional Product-Layer Expansion¶

only if multiple capabilities justify a broader AI product layer

What We Intentionally Do Not Do Right Now¶

turn the platform into an LLM-first architecture
replace deterministic truth with provider output
introduce a giant “lazy investor” app for naming alone
let prompts control transport or provider routing
let AI ride on shared critical worker lanes

Definition of Done¶

The AI layer is considered integrated correctly when:

shared core.ai exists as the execution foundation;
deterministic domains remain canonical;
capability availability is not controlled by one boolean flag;
prompt, task, capability, and provider are separated;
language is resolved formally and predictably;
output validation is explicit;
degraded and offline states are observable and typed;
generation and evaluation are explicitly separated;
AI workloads cannot destabilize deterministic runtime paths.

Main Conclusion¶

The correct path for IRIS is not “add another AI app.”

The correct path is:

formalize shared AI execution foundations;
keep deterministic truth in domain layers;
expose compact product capabilities on top of that foundation;
grow AI surfaces only where they add bounded, explainable value.
