NafqaX System Tree & Rust Migration Checklist

Operational checklist for the full NafqaX system tree, Python monolith status, Rust production-adjacent modules, and runner-command-mailbox migration rollout.

1. Production Status

Orchestrator: healthy
Stream: healthy
Active release: coordinator-runner-ws-task-completion-f385dcb6-20260509T193744Z
Backend production runtime: Rust global active for command-mailbox core with fallback OFF; Python retained for rollback/source parity
Rust modules in production: memory embedded + command-mailbox global core
Phase 5 command-mailbox: PARTIAL_OK; SMALL_BATCH_DECOMPOSITION_OK; FALLBACK_REMOVAL_CANARY_OK; RUST_COMMAND_MAILBOX_IMPLEMENTED_SCOPED; NEXT_BATCH_SCOPED_EXPANSION_OK; ROUND1_FINAL_STATE_AUDIT_OK; ROUND2_MINIMAL_SENTINEL_OK; ROUND3_PRIMARY_CUTOVER_CANDIDATE_OK; ROUND4_ROLLBACK_DRILL_OK; ROUND5_DECOMMISSION_PACKAGE_PREPARED; FIRST_SOURCE_REMOVAL_PROXY_RELEASE_PROMOTED; ROUND1_SCOPED_PYTHON_HANDLER_SURFACE_REMOVED_LOCAL; SCOPED_PYTHON_HANDLER_GUARD_RELEASE_PROMOTED; SCOPED_PYTHON_HANDLER_GUARD_ACTIVE_MINIMAL_OK; RUNNER_COMMAND_HANDLER_GUARD_RELEASE_PROMOTED; RUNNER_COMMAND_HANDLER_GUARD_ACTIVE_MINIMAL_OK; NO_SCOPED_PYTHON_HANDLERS_RELEASE_PROMOTED; NO_SCOPED_PYTHON_HANDLERS_ACTIVE_MINIMAL_OK; LAZY_PYTHON_HANDLERS_RELEASE_PROMOTED; LAZY_PYTHON_HANDLERS_ACTIVE_MINIMAL_OK; COMMAND_MAILBOX_SCOPED_CORE_FINALIZED; GLOBAL_COMMAND_MAILBOX_RUST_ACTIVE_OK; TASK_COMPLETION_REVIEW_RELEASE_PROMOTED; TASK_OUTPUT_COMPLETION_REVIEW_RELEASE_PROMOTED; AUTO_TASK_CONTROL_PAYLOAD_RELEASE_PROMOTED; RUNNER_WS_TASK_COMPLETION_RELEASE_PROMOTED
DB Foundation: runner_control ledger + timeline applied
Validated Rust scope: global core runtime active; off-old-allowlist probe 306 passed mailbox/cursor/ACK
runner-command-mailbox-service: active liveness; global Rust core runtime active
Rust liveness: port 8099; /health=200; /readiness=200
DB-write readiness gate: canary evidence validated
Scope note: global_scope_enabled=1; canary_only=0; READ_PATH=false; AUTHORITATIVE=0; DUAL_WRITE=0
Global authoritative / dual-write: OFF
Last updated (UTC): 2026-05-09 06:38
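
The scope note above is a compact key=value record, so drift from the recorded state can be checked mechanically. The following is a minimal sketch, assuming the note is available as a single string in the format shown; the helper names (parse_scope_note, scope_drift) and the accepted boolean spellings are hypothetical and do not come from the NafqaX codebase.

```python
from __future__ import annotations

# Expected values copied from the "Scope note" line of this checklist.
# Everything else in this block is a hypothetical helper for illustration.
EXPECTED = {
    "global_scope_enabled": True,
    "canary_only": False,
    "READ_PATH": False,
    "AUTHORITATIVE": False,
    "DUAL_WRITE": False,
}

_TRUE = {"1", "true", "on", "yes"}
_FALSE = {"0", "false", "off", "no"}


def parse_scope_note(note: str) -> dict[str, bool]:
    """Parse 'key=value; key=value' into booleans; reject unknown spellings."""
    flags: dict[str, bool] = {}
    for part in note.split(";"):
        part = part.strip()
        if not part:
            continue
        key, sep, raw = part.partition("=")
        value = raw.strip().lower()
        if not sep or value not in _TRUE | _FALSE:
            raise ValueError(f"malformed flag: {part!r}")
        flags[key.strip()] = value in _TRUE
    return flags


def scope_drift(flags: dict[str, bool]) -> list[str]:
    """Return the names of flags that are missing or differ from EXPECTED."""
    return [name for name, want in EXPECTED.items() if flags.get(name) != want]
```

An empty list from scope_drift means the parsed note matches the state recorded here; any returned name (for example AUTHORITATIVE) is a flag to investigate before further rollout work.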

2. app.py Monolith Summary

backend/app.py: 39,455 lines
Inventoried functions: 791
Inventoried route decorators: 169
Cataloged domains/nodes: 25
Status: monolith active in production
Risk: editing app.py without a map can reintroduce bugs

3. System Tree Checklist

Each node tracks the full system tree, not only the Rust migration.

01. health/readiness/bootstrap PYTHON_PRODUCTION_ACTIVE
    Responsibility: health endpoints, readiness, version, metrics, app bootstrap
    Rust target: observability-health-readiness
    Priority: P1 support
    Next step: keep Python authoritative until explicit service rollout
    Risk: false readiness or exposing sensitive runtime detail
02. auth/users/orgs/projects PYTHON_PRODUCTION_ACTIVE
    Responsibility: auth, user sessions, orgs, projects, membership
    Rust target: auth-users-orgs-projects
    Priority: P4
    Next step: document contracts before extraction
    Risk: broad auth/session blast radius
03. RBAC/membership/grants PYTHON_HOTFIXED
    Responsibility: project roles, runner operational authorization, grants
    Rust target: shared-auth, runner-rbac-policy
    Priority: P0 support
    Next step: preserve explicit field-based RBAC
    Risk: false allow/deny and platform admin overreach
04. runner device registry PYTHON_HOTFIXED
    Responsibility: runner registration, device lifecycle, device-scoped project resolution
    Rust target: runner-device-registry
    Priority: P1
    Next step: audit after command-mailbox global runtime stabilizes
    Risk: wrong device identity or stale project binding
05. runner token/QR approval PYTHON_HOTFIXED
    Responsibility: runner approval requests, mobile approval, QR pickup
    Rust target: runner-token-approval
    Priority: P2
    Next step: leave in Python until rollout pattern is proven
    Risk: approval replay or stale pickup
06. runner config/sync PYTHON_HOTFIXED
    Responsibility: runner config, sync, device-aware project mapping
    Rust target: runner-device-registry + runner-project-binding
    Priority: P1
    Next step: keep device-first resolution invariant
    Risk: default global project regression
07. runner commands PYTHON_PRODUCTION_ACTIVE RUST_PARTIAL_SCOPE_VALIDATED
    Responsibility: command queue, command write, next polling, claim lifecycle
    Rust target: runner-command-mailbox-service
    Priority: P0
    Next step: use Round 5 package for a separately approved source-removal/proxy patch
    Risk: duplicate command, wrong runner/project, fallback drift, or accidental off-scope routing
08. runner command status/logs/results PYTHON_PRODUCTION_ACTIVE PYTHON_HOTFIXED
    Responsibility: command status, logs, result idempotency, exec result
    Rust target: runner-command-mailbox-service later
    Priority: P0
    Next step: keep out of partial decommission until a separate QA gate
    Risk: status leakage, sensitive log material, lost result
09. mailbox PYTHON_PRODUCTION_ACTIVE RUST_PARTIAL_SCOPE_VALIDATED PYTHON_HOTFIXED
    Responsibility: mailbox input, pull, ack, cursor, epoch, runner delivery
    Rust target: runner-command-mailbox-service
    Priority: P0
    Next step: use Round 5 package; keep off-scope traffic on Python until approved cutover
    Risk: off-scope mutation, wrong epoch/cursor, duplicate ACK, or unexpected Python fallback inside scope
10. cli send/stream/chat PYTHON_PRODUCTION_ACTIVE RUST_PARTIAL_SCOPE_VALIDATED
    Responsibility: chat send, raw send, stream, session runner route
    Rust target: runner-command-mailbox-service for scoped command/mailbox writes; later stream-events
    Priority: P0/P3
    Next step: keep stream/chat and off-scope traffic on Python; monitor scoped Rust command/mailbox writes
    Risk: coordinator stays thinking, fallback drift, or routes to old runner
11. coordinator session/runtime PYTHON_HOTFIXED
    Responsibility: coordinator start, stop, status, runtime/fleet state
    Rust target: coordinator-session-runtime
    Priority: P2
    Next step: wait for command/mailbox rollout lessons
    Risk: runtime split-brain, stale coordinator device
12. project artifacts PYTHON_PRODUCTION_ACTIVE
    Responsibility: artifact write/read/apply and project output promotion
    Rust target: project-artifacts
    Priority: P2
    Next step: keep command-backed until result semantics are stable
    Risk: path safety and stale session project apply
13. runner project binding PYTHON_HOTFIXED
    Responsibility: local projects, workspace bindings, apply session project
    Rust target: runner-project-binding
    Priority: P1
    Next step: audit after runner-device registry boundary is explicit
    Risk: stale session project switching
14. mobile APIs PYTHON_PRODUCTION_ACTIVE
    Responsibility: mobile auth, device sessions, runner approvals
    Rust target: mobile-auth-mobile-runner-approval
    Priority: P3
    Next step: extract only after runner approval boundaries are explicit
    Risk: mobile session drift and approval identity mismatch
15. admin platform PYTHON_PRODUCTION_ACTIVE
    Responsibility: platform admin users, org plans, audits, repair operations
    Rust target: admin-platform
    Priority: P4
    Next step: do not mix with runner operation RBAC
    Risk: broad admin operation blast radius
16. dashboard support APIs PYTHON_PRODUCTION_ACTIVE
    Responsibility: messages, timeline, setup wizard, project info, dashboard helpers
    Rust target: possible dashboard-support later
    Priority: P3
    Next step: keep frontend debt separate from Rust rollout
    Risk: UI route/auth expectations and stale browser state
17. stream/SSE/events PYTHON_PRODUCTION_ACTIVE
    Responsibility: SSE, mailbox stream, CLI stream, event delivery
    Rust target: stream-events
    Priority: P3
    Next step: wait until command/mailbox source of truth is settled
    Risk: missed events, duplicates, backpressure
18. AI providers PYTHON_PRODUCTION_ACTIVE
    Responsibility: provider accounts, configs, catalog, execution providers
    Rust target: ai-providers or remain Python
    Priority: P4
    Next step: credential safety audit before any extraction
    Risk: credential handling and provider behavior
19. agents/sessions/tasks PYTHON_PRODUCTION_ACTIVE
    Responsibility: agent sessions, tasks, interactions, process state
    Rust target: agents-sessions-tasks
    Priority: P3
    Next step: separate monolith audit
    Risk: task lifecycle regressions and old state planes
20. workflows/templates PYTHON_PRODUCTION_ACTIVE
    Responsibility: workflows, templates, workflow responses
    Rust target: workflows-templates
    Priority: P4
    Next step: defer; not on critical runner/mailbox path
    Risk: workflow compatibility and template migration
21. runtime/operator/Norcx UNKNOWN_NEEDS_AUDIT
    Responsibility: operator/runtime integration and agent process control
    Rust target: runtime-operator-norcx
    Priority: P4
    Next step: operator-specific audit before changes
    Risk: process lifecycle side effects
22. observability/health/SigNoZ/OTel PYTHON_PRODUCTION_ACTIVE
    Responsibility: health, readiness, metrics, logs, observability status
    Rust target: shared-observability
    Priority: P1 support
    Next step: treat Rust /health and /readiness as liveness; DB-write readiness must come from canary evidence
    Risk: false readiness or telemetry drift
23. background jobs/schedulers DO_NOT_TOUCH_WITHOUT_AUDIT
    Responsibility: cleanup, sync, pollers, watchdogs, hidden write paths
    Rust target: separate job modules after audit
    Priority: P4
    Next step: inventory side effects before editing
    Risk: hidden production writes
24. release/deploy/static frontend PYTHON_PRODUCTION_ACTIVE
    Responsibility: release dirs, static assets, frontend serving, operational handoff
    Rust target: none
    Priority: P0 operational
    Next step: confirm systemd WorkingDirectory before any release work
    Risk: serving stale or wrong frontend
25. unknown/needs classification UNKNOWN_NEEDS_AUDIT
    Responsibility: legacy helpers and less-used admin/debug endpoints
    Rust target: unknown
    Priority: audit as needed
    Next step: add owner before moving or deleting
    Risk: hidden side effects and implicit coupling
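
Node 22 separates two signals that are easy to conflate: the Rust /health and /readiness endpoints indicate liveness only, while DB-write readiness must come from canary evidence. A minimal sketch of that gate follows, assuming the probe status codes and the canary verdict have already been collected elsewhere; the ProbeResult type and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeResult:
    """HTTP status codes returned by the Rust service probes (hypothetical type)."""
    health_status: int
    readiness_status: int


def is_live(probe: ProbeResult) -> bool:
    # Both endpoints answering 200 is treated as liveness only.
    return probe.health_status == 200 and probe.readiness_status == 200


def db_write_ready(probe: ProbeResult, canary_evidence_validated: bool) -> bool:
    # Liveness alone never opens the DB-write gate; validated canary
    # evidence is required in addition.
    return is_live(probe) and canary_evidence_validated
```

With the values recorded in section 1 (/health=200, /readiness=200, canary evidence validated), db_write_ready returns True; with the same probes and no validated canary evidence it returns False.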

4. Python vs Rust Migration Matrix

Item | Current Python | Current Rust | Status | Remaining to finalize
/api/mailbox/pull | production active for pull/read semantics | global Rust core mutation gate active; pull/read path remains Python-led because READ_PATH=false | PYTHON_PRODUCTION_ACTIVE | do not mark pull/read globally moved without separate QA gate
/api/cli/send | production active for routing/chat/stream orchestration and rollback parity | command-mailbox core runtime active globally in Rust with COMMAND_MAILBOX_RUST_GLOBAL_ENABLED=true and fallback OFF | RUST_PARTIAL_SCOPE_VALIDATED | monitor global release before deleting residual Python source
/api/mailbox/ack | rollback/source parity retained | global Rust ACK core mutation active; off-old-allowlist probe validated ACK without duplication | RUST_PARTIAL_SCOPE_VALIDATED | continue rollback preservation until residual Python source-removal gate
/api/runner/commands/next | production active | command write path covered by Rust core tests; next/claim read path not globally moved | PYTHON_PRODUCTION_ACTIVE | separate read/claim canary before moving this path
/api/runner/commands/{id}/status | production active, RBAC fixed locally | not part of the approved partial command-mailbox decommission | PYTHON_PRODUCTION_ACTIVE | separate status/result migration gate required
/api/runner/commands/{id}/logs | production active | not part of the approved partial command-mailbox decommission | PYTHON_PRODUCTION_ACTIVE | log redaction and compatibility tests before any Rust path
/api/runner/exec/result | production active | not part of the approved partial command-mailbox decommission | PYTHON_PRODUCTION_ACTIVE | result write canary and rollback package required
/api/admin/auto_coordinator/start | production active after coordinator hotfix chain | route decision support only | RUST_NOT_STARTED | runtime service shadow and canary
/api/runner/projects/apply_session_project | production active with stale-session safeguards | command result support only | RUST_NOT_STARTED | project binding and artifact contract tests
/api/runner/config | production active, device-aware hotfix | snapshots cover regression | RUST_NOT_STARTED | runner-device registry audit
/api/runner/sync | production active, project overwrite guarded | snapshots cover related risk | RUST_NOT_STARTED | sync contract and project binding module
QR pickup | production active, hotfixed | no Rust module | RUST_NOT_STARTED | approval state machine audit
runner friendly/display name | production active, hotfixed | no Rust module | RUST_NOT_STARTED | runner-device display contract
frontend stale session guard | frontend hotfix in published source chain | no Rust module | PYTHON_HOTFIXED | frontend regression tests remain tracked separately
memory_plane_service | Python adapters integrate with memory plane schema/contracts | Rust crate shipped in active release | RUST_PARTIAL_PRODUCTION RUST_EMBEDDED_RUNTIME | focused memory plane audit before flag/schema/read changes
memory plan/planning promotion | Python planning/session flow can materialize/compare memory plane | Rust-defined memory plane domain through Python adapter runtime | RUST_PARTIAL_PRODUCTION | verify active flags, shadow-read behavior, and primary store boundary
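
The first rows of the matrix imply a simple ownership rule for the command-mailbox core: mutation paths are Rust-owned while the global flag is on, the pull/read path stays Python-led while READ_PATH=false, and every other route remains on Python. The sketch below illustrates that rule only. The path sets are read off the matrix, and the MailboxFlags type and core_owner function are hypothetical, not the actual logic in command_mailbox_runtime_switch.py.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MailboxFlags:
    rust_global_enabled: bool  # mirrors COMMAND_MAILBOX_RUST_GLOBAL_ENABLED
    read_path: bool            # mirrors READ_PATH


# Assumption: classification taken from the matrix rows, not from source code.
CORE_MUTATION_PATHS = frozenset({"/api/cli/send", "/api/mailbox/ack"})
READ_PATHS = frozenset({"/api/mailbox/pull"})


def core_owner(path: str, flags: MailboxFlags) -> str:
    """Return which runtime owns the command-mailbox core step for a route."""
    if not flags.rust_global_enabled:
        return "python"  # flag-based rollback to Python-only
    if path in CORE_MUTATION_PATHS:
        return "rust"
    if path in READ_PATHS and flags.read_path:
        return "rust"
    return "python"  # status, logs, results, config, sync, and the rest
```

Note that for /api/cli/send the result refers only to the command-mailbox core write; per the matrix, routing, chat, and stream orchestration for that route remain on Python.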

5. Rust Modules Already in Production / Production Adjacent

Module | Status | Execution | Source | Notes
memory_plane_service | RUST_PARTIAL_PRODUCTION RUST_EMBEDDED_RUNTIME | via Python adapters/contracts; no daemon found | services/memory_plane_rust/ and backend/api/src/memory_plane_shadow*.py | Rust crate/schema shipped in active release; Python remains observed runtime entrypoint.
memory plan / planning promotion | RUST_PARTIAL_PRODUCTION | Python planning flow backed by memory plane domain | backend/app.py, memory plane adapters, Rust memory plane crate | Planning/initial project memory flow; not a separate Rust service.
runner-command-mailbox-service | RUST_PARTIAL_SCOPE_VALIDATED | loopback service on port 8099; global command-mailbox core runtime active with fallback OFF | rust-services/ and backend/api/src/command_mailbox_runtime_switch.py | GLOBAL_COMMAND_MAILBOX_RUST_ACTIVE_OK passed on release b4f3d061; COMMAND_MAILBOX_RUST_GLOBAL_ENABLED=true, NAFQAX_RUST_GLOBAL_SCOPE_ENABLED=1, CANARY_ONLY=0, AUTHORITATIVE=0, DUAL_WRITE=0, and READ_PATH=false.

6. Runner Command Mailbox Checklist

7. Definition of Done

8. Risks / Blockers

9. Next Recommended Actions

  1. Retain the evidence at the following paths:
     /var/backups/nafqax-agent-infra/command-mailbox-partial-decommission-20260504T035517Z/
     /var/backups/nafqax-agent-infra/progressive-scope2-unsetenv-fix-20260506T054128Z
     /var/backups/nafqax-agent-infra/small-batch-decomposition-20260506T060953Z
     /var/backups/nafqax-agent-infra/fallback-removal-canary-20260506T070952Z
     /var/backups/nafqax-agent-infra/minimal-fallback-off-20260506T074019Z
     /var/backups/nafqax-agent-infra/rust-command-mailbox-scoped-5-fallback-off-20260506T075828Z
     /var/backups/nafqax-agent-infra/command-mailbox-six-item-finalization-20260506T081250Z
     /var/backups/nafqax-agent-infra/command-mailbox-next-batch-scoped-expansion-20260506T083640Z
     /var/backups/nafqax-agent-infra/command-mailbox-finalization-round1-20260507T015645Z
     /var/backups/nafqax-agent-infra/command-mailbox-finalization-round2-20260507T020429Z
     /var/backups/nafqax-agent-infra/command-mailbox-finalization-round3-20260507T021345Z
     /var/backups/nafqax-agent-infra/command-mailbox-finalization-round4-20260507T021459Z
     /var/backups/nafqax-agent-infra/command-mailbox-finalization-round5-20260507T022134Z
     /var/backups/nafqax-agent-infra/command-mailbox-lazy-python-handlers-active-minimal-20260507T061802Z
     /var/backups/nafqax-agent-infra/command-mailbox-global-rust-active-20260507T070039Z
  2. Re-run the coordinator task infra/audit E2E through the authenticated path against release f385dcb6. The authenticated attempt on 92cfc573 created the infra task and ran the agent, but was blocked by a task_output with no ack/finalization; evidence is in /var/backups/nafqax-agent-infra/coordinator-task-e2e-92cfc573-online-20260509T171122Z.
  3. Continue to require the full matrix only when a piece of evidence is ambiguous.
  4. Do not enable Rust authoritative mode, global dual-write, or full Python removal without a newly approved window.
  5. Keep flag-based rollback to Python-only and preserve the ledger/timeline.
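
Action 1 depends on the evidence directories still being present on disk. A minimal sketch of a presence check follows, assuming the evidence lives as plain directories under a common root; the function name missing_evidence is hypothetical, and the check verifies existence only, not content or integrity.

```python
from __future__ import annotations

from pathlib import Path


def missing_evidence(root: Path, names: list[str]) -> list[str]:
    """Return the evidence directory names that are not directories under root."""
    return [name for name in names if not (root / name).is_dir()]
```

For example, calling missing_evidence(Path("/var/backups/nafqax-agent-infra"), ["command-mailbox-global-rust-active-20260507T070039Z"]) returns an empty list when that directory is still in place, and the directory name otherwise.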