Backend Business Localization Plan¶

Goal¶

Define how IRIS should support multilingual user-facing behavior at the backend level so that:

users can understand business narration in API responses, Home Assistant surfaces, event-like delivery surfaces, and operation-status flows;
machine contracts remain canonical and automation-safe;
accepted service/engine and presenter boundaries remain intact;
localization fits the existing governance model instead of becoming a separate mini-architecture.

This document covers backend business localization, not frontend i18n.

Architectural Context¶

The plan must work inside already accepted IRIS constraints:

service layer and analytical engines are already separated;
post-commit side effects are already explicit;
runtime is event-driven and governed by control plane;
HTTP API lives under OpenAPI, capability, and analytical cache governance;
Home Assistant is an external automation and notification host;
the AI layer is a separate capability and must not replace deterministic localization.

Current Repository State¶

At the time of this plan, the repository still had several constraints that must be addressed explicitly rather than hidden under abstractions:

signal and portfolio surfaces still depended on reason: str rather than formalized reason taxonomy;
some API boundary contracts still re-exported app-level schemas too directly;
HTTP cache semantics did not yet vary by language;
settings exposed only a single language field rather than a richer locale policy;
runtime stream messages still used text-first legacy payloads;
the HA integration still consumed raw business reason text;
HA static translations and backend business narration were not cleanly separated.

Therefore, localization cannot start from “translating strings.” It must start from canonical taxonomy and a transport-safe foundation.

Problem¶

The backend mostly returned machine-oriented semantics:

statuses and codes;
reason or message text as free strings;
operation messages as unstructured text;
HA payloads and legacy bus messages without formalized descriptor models.

That is acceptable for automation, but insufficient for humans:

users see system vocabulary instead of understandable messages;
free-text reason fields are hard to translate and hard to stabilize;
output language is not properly controlled in API cache semantics or HA delivery;
translation and business truth can easily get mixed inside boundary code.

Scope¶

This plan covers backend-owned human-facing text only:

human-readable business messages in API responses;
event, SSE, or push payloads that are not machine-only;
persisted notifications or similar backend-owned artifacts;
deterministic summaries and briefs that are not AI-generated.

This plan does not cover:

frontend i18n;
arbitrary user content translation from the database;
AI-generated freeform translation as the primary mechanism;
translation of machine identifiers such as BUY, SELL, signal_created, or market_regime_changed.

Core Idea¶

The business logic itself is not translated.

What gets translated is what the system tells a human about the results of that logic.

Correct model for IRIS:

machine outcome remains canonical;
a formalized narration descriptor is built above it;
the descriptor is rendered at the boundary in the requested locale;
humans get understandable text while automation continues to operate on codes.

Non-Negotiable Rules¶

1. Canonical Taxonomy First¶

Localization starts with formalized machine vocabulary, not free text.

Minimum canonical set:

reason taxonomy
message keys
structured message params
stable machine-readable error and status codes

If a domain still depends on reason: str, direct translation of that text is the wrong first step.

2. Machine Contracts Remain the Source of Truth¶

Automation, routing, policies, alerts, and downstream integrations must rely on canonical machine fields, not localized text.

3. Localization Does Not Live in Engines¶

Pure engines and deterministic domain math must not:

accept locale;
query translation catalogs;
render user-facing text;
format labels, percentages, or durations.

4. Localization Does Not Live in Orchestration Services as Inline Strings¶

Services may return codes, facts, and typed results, but they must not become scattered string builders.

5. Boundary Contracts Must Be Separate From App Schemas¶

Do not simply add message_key, message, or message_params to app-level schemas when those same schemas still act as domain or application forms.

Localization should enter through dedicated API or HA envelopes and presenter contracts.

6. One Narration System Per Surface¶

Avoid a parallel world with:

a new descriptor-based API and HA layer;
old text-first runtime messages;
separate ad-hoc operation text.

Each surface needs one target narration model and a clear deprecation path for legacy text.

7. AI Is Not the Primary Translation Mechanism¶

AI may enrich explanation or humanize deterministic outcomes, but it must not replace the canonical descriptor layer.

Migration Matrix for Legacy Fields¶

Current field	Target form	Transitional state
`reason: str`	`reason_code`	`reason` remains temporarily as deprecated legacy text
`message: str`	`message_key` + `message_params` + optional rendered `message`	dual-field transitional contract is allowed on pilot surfaces
`status: str`	`status`	remains canonical
`decision: str`	`decision`	remains canonical
`text` in legacy bus	descriptor or explicit deprecated text field	legacy bus cleanup happens in a separate wave

Target Contract Model¶

1. Machine Outcome¶

Machine outcome remains typed, canonical, and language-neutral.

2. Message Descriptor¶

Above the machine outcome, build a deterministic narration descriptor:

message_key
message_params
surface
optional variant

surface exists so the same semantic message can be adapted to API, HA, notifications, or operation-status use without collapsing everything into one ambiguous key.

variant exists so short, normal, and extended forms can coexist explicitly.

3. Localized Render¶

Boundary adapters render:

message_key + params + locale -> human text

4. Final User-Facing Payload¶

The final payload may contain both:

machine fields
rendered human text

message_params should carry semantics, not preformatted display artifacts.

Target Layer Model¶

1. `core/i18n`¶

Backend localization needs a shared core:

supported locale set
locale resolver
translator
formatter for numbers, percentages, counts, dates, and durations
fallback policy

2. Domain Narratives¶

Domains that need human-readable backend output should provide a pure narrative layer.

Role:

accept typed domain result or read model;
return MessageDescriptor;
perform no IO;
know nothing about HTTP, HA, Redis Streams, or AI providers.

3. Boundary Adapters¶

Localization rendering lives in presenters and adapters.

They:

take MessageDescriptor;
choose locale;
call the translator;
add rendered text to response or event payload.

4. API-Localized Contracts Separate From App Schemas¶

Pilot rollout and long-term rollout must explicitly separate:

app and domain result contracts
localized API envelopes

Locale and Configuration Policy¶

Canonical Locale Model¶

Supported locales should be modeled in proper BCP 47 form.

ua is not a canonical language tag and should not be emitted as output.

Migration path:

accept ua temporarily as input alias;
normalize it to uk;
never store or emit ua as effective locale.

Repo-Level Settings¶

A single IRIS_LANGUAGE-style field is not enough forever.

Target configuration should support:

instance default language;
fallback language;
normalization policy;
future expansion for more locale-aware behavior.

The current language field may remain as a compatibility alias during migration.

Resolution Order¶

explicit locale override on the surface
stored user or integration preference when available
settings.language
fallback locale

All input is normalized to the supported locale set.

Surface-Specific Overrides¶

A minimal starting point may support:

instance default language
explicit override header such as X-IRIS-Locale

Cache and OpenAPI Policy¶

Localization must not be layered onto current read endpoints without transport-safe semantics.

1. Stable Contract Per Endpoint¶

At the start, avoid dual-mode endpoint shapes.

Reasons:

fixed response_model is part of governance;
committed OpenAPI snapshots must stay stable;
analytical cache semantics are not ready for ambiguous locale-dependent variance under one shape.

Practical rule:

either an endpoint stays canonical-only;
or it gets an explicit localized read contract.

2. Language-Aware Cache Semantics¶

As soon as an endpoint renders locale-dependent narration, caching must account for language.

Otherwise the system will return:

the wrong language from cache;
incorrect 304 Not Modified;
stale locale variants under the same ETag.

3. Snapshot Surfaces¶

Localization does not remove existing analytical snapshot requirements.

Locale becomes one more dimension of the transport contract.

Formatting Policy¶

Localization without formatting policy collapses quickly.

Define formally:

number formatting
percentage formatting
signed deltas
counts
date and datetime presentation
durations

The rule is simple:

message_params carry semantics;
translator and formatter layers handle display formatting.

Persisted Artifacts Policy¶

Storing only message_key + params is useful but not enough for audit-heavy surfaces.

If the catalog changes later, the same historical artifact might render differently.

Therefore persisted notifications, operation history, and similar surfaces need at least one of:

catalog_version alongside the descriptor
a materialized render snapshot
or both for especially audit-sensitive surfaces

HA-Specific Policy¶

1. HA-Side Static Integration Strings¶

Strings like config-flow labels, static UI labels, and integration chrome are the responsibility of the HA integration itself.

2. Backend-Rendered Business Narration¶

Business narration about domain outcomes is the responsibility of the backend.

HA should not rebuild business messages from raw fields.

HA Payload Rule¶

HA-facing payloads should contain:

machine truth for automation;
descriptor or localized narration for humans.

The current text-first transmission of business reason in HA event payloads is considered legacy and must be migrated.

Legacy Message Bus Policy¶

The runtime stream message layer still contains a text-first model.

That cannot remain as a second independent narration system.

Plan requirement:

either migrate legacy message bus payloads to descriptor-based shape;
or put them on an explicit deprecation path and remove them from canonical human-facing surfaces.

Operations and Shared HTTP Contracts¶

Operation and status narration are in scope, but not in the pilot wave.

Current shared contracts still include text-first fields like message: str.

Implications:

localizing operation and status flows requires changes to shared HTTP contracts;
that should not be mixed into the first presenter-layer pilot.

Relation to the AI Layer¶

AI humanization must not be the first step in multilingual support.

First build deterministic backend narration.

Only then may AI:

rephrase;
extend explanation;
adapt tone;
build briefs on top of the canonical descriptor layer.

Why Not `gettext` Like Django¶

Django-style gettext is good for:

HTML templates
static UI strings
view-layer translation in a traditional web app

But it is not the right primary architecture for IRIS because:

the backend is typed API-first, not HTML-first;
value lives in business narration rather than static page templates;
IRIS needs machine codes and localized text at the same time;
transport-safe cache and OpenAPI policy matter as much as string lookup.

The better path for IRIS is:

descriptor-based narration
a typed translator core
richer ICU/Babel-like formatting later if needed

Proposed File Structure¶

Practical target:

backend/src/core/i18n/
  contracts.py
  locales.py
  resolver.py
  translator.py
  formatter.py
  catalogs/

backend/src/apps/<domain>/
  narratives.py
  api/
    presenters.py
    localized_contracts.py

If a domain does not need a full package, narratives.py is acceptable, but not inline strings inside services.

Rollout Plan¶

Wave 0. Canonical Taxonomy First¶

define reason taxonomy
keep legacy reason only as deprecated compatibility
fix migration matrix for shared fields

Wave 1. Transport-Safe Foundations¶

add core/i18n contracts, translator, resolver, and formatter
fix catalog format
define repo-level locale settings
normalize ua -> uk
update cache semantics for locale-aware surfaces

Wave 2. Presenter-Layer Pilot¶

start with signals and portfolio
add pure narrative descriptor builders
separate localized API contracts from app schemas
localize through presenters only

Wave 3. HA and Legacy Bus Cleanup¶

migrate HA-facing business payloads to descriptors
decide which strings remain HA-side static strings
remove reliance on raw reason
migrate or explicitly deprecate legacy message bus text

Wave 4. Operations and Shared HTTP Contracts¶

localize accepted, job, and operation narration
update shared HTTP contracts
define persistence policy for operation history render snapshots

Wave 5. Expansion¶

extend to market_structure, anomalies, predictions, and cross_market
connect the same locale contract to AI-derived surfaces
forbid AI-only narration without a canonical descriptor underneath

Architecture Checks¶

Once the layer stabilizes, useful automated checks include:

services and engines do not import translator directly
pilot domains no longer depend on raw reason: str as source of truth
API-localized contracts stay separate from app schemas
locale-aware endpoints use language-aware cache semantics
catalogs cover required keys
missing translations produce typed fallback instead of silent empty strings
persisted audit-heavy artifacts carry catalog_version or materialized render snapshots

Definition of Done¶

Backend business localization is considered correctly implemented only if:

pilot domains no longer depend on raw reason as the canonical explanation field;
machine contracts remain language-neutral;
localized narration is built in a separate pure layer;
localized API contracts are separate from app schemas and read models;
locale is chosen formally and predictably;
locale-aware reads do not break OpenAPI or cache semantics;
HA receives both machine fields and human-readable business narration;
legacy text-first runtime paths are either migrated or explicitly deprecated;
the backend can render at least ru/en/es/uk;
no domain engine knows about translation;
the AI layer does not replace deterministic localization.

Main Conclusion¶

For IRIS, the correct path is neither “translate reason strings directly” nor “sprinkle _() through services.”

The correct path is:

formalize canonical taxonomy;
add a transport-safe localization foundation;
localize boundary presenters on pilot domains;
only then move the model into HA, legacy bus, and shared operation surfaces.

That path is compatible with the current IRIS architecture and does not break the cleaned-up service/runtime boundaries.