Forge Architecture Debate — All-Model Convergence

Four models (Grok CTO, Gemini, Opus, Codex) debate the architecture for Forge as a meta-orchestrator.

Updated: April 1, 2026

70% System Built
4 Products Managed
6 Critical Gaps Found
2 Repos (antilles + lottery)
5 Priority Items

The Architecture (Converged)

YOU (Telegram / CLI)
│
▼
FORGE (Meta-Orchestrator)
├── Hermes (brain, supervises all tracks event-driven)
├── Memory Router (8 providers, CXDB, hermes-memory)
├── API Bridge (connects OpenClaw ↔ Memory ↔ Toolkit)
├── Research Harness (Swarma, MiroShark, Evals, Auto-research)
│
├── tracks.yaml (maps tracks to product repos + branches)
│   ├── rmt-reputation → antilles-v2 / feature/rmt-core-porting
│   ├── identity → antilles-v2 / feature/identity-foundation
│   ├── x402 → antilles-v2 / agent/x402-trust-channels
│   └── lottery → lottery / main
│
└── Per-Track Instances:
    ├── Research State Machine (STRATEGY → EXPERIMENT → VERDICT)
    ├── Sprint Dev Pipeline (SEED → BUILD → VALIDATE → COMMIT)
    ├── Eval Matrix (5-dimensional: Scenario×Actor×Scale×Condition×Metric)
    └── Working Memory (synced to product repo docs/)
│
▼
PRODUCT REPOS (antilles-v2, lottery, future...)
├── Receive PRs from Forge sprint pipeline
├── Receive working memory summaries
├── Deploy to testnet/mainnet via Forge commands
└── Stay clean: only production-ready code
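The two per-track pipelines in the diagram can be read as ordered stage lists. The following is a minimal sketch, assuming both pipelines are strictly linear (the dashboard does not say whether VERDICT loops back to STRATEGY); the stage names come from the diagram, while `nextStage` and `runToCompletion` are hypothetical helpers, not Forge APIs.

```typescript
// Stage names are taken from the architecture diagram; everything else is illustrative.
const RESEARCH_STAGES = ["STRATEGY", "EXPERIMENT", "VERDICT"] as const;
const SPRINT_STAGES = ["SEED", "BUILD", "VALIDATE", "COMMIT"] as const;

// Return the stage that follows `current`, or null when the pipeline is finished.
function nextStage<T extends string>(stages: readonly T[], current: T): T | null {
  const index = stages.indexOf(current);
  if (index === -1) {
    throw new Error(`Unknown stage: ${current}`);
  }
  const next = stages[index + 1];
  return next === undefined ? null : next;
}

// Walk a pipeline from its first stage to its last, recording each stage visited.
function runToCompletion<T extends string>(stages: readonly T[]): T[] {
  const visited: T[] = [];
  let stage: T | null = stages[0] ?? null;
  while (stage !== null) {
    visited.push(stage);
    stage = nextStage(stages, stage);
  }
  return visited;
}
```

Keeping the stages as data rather than hard-coded branches means a per-track instance only needs to store its current stage name.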

The Flow: Research → Ship

1. Research Phase

You say: "optimize sybil detection"

Forge runs: Swarma loops, MiroShark sims, auto-research, evals

Findings stay in Forge (CXDB, hermes-memory)

You get: Telegram notification with narrative summary

2. Approval Gate

Hermes: "Detection improved 93.7%→99.2%. Approve build?"

You tap: [Approve] on Telegram

Grok CTO validates the finding
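The gate above can be sketched as a small pure function. This assumes that both the Telegram approval and the Grok CTO validation are required before a sprint starts; the dashboard lists both steps but does not state how they combine. `Finding`, `GateState`, `approvalPrompt`, and `canStartSprint` are hypothetical names.

```typescript
// Hypothetical shapes: the dashboard describes the gate but not its data model.
interface Finding {
  metric: string;
  before: number; // percent before the research run
  after: number; // percent after the research run
}

interface GateState {
  userApproved: boolean; // the [Approve] tap on Telegram
  ctoValidated: boolean; // Grok CTO validation of the finding
}

// Format the prompt Hermes sends, mirroring the example message above.
function approvalPrompt(finding: Finding): string {
  return `${finding.metric} improved ${finding.before}%→${finding.after}%. Approve build?`;
}

// Assumption: a build may start only when both sign-offs are present.
function canStartSprint(state: GateState): boolean {
  return state.userApproved && state.ctoValidated;
}
```

For example, `approvalPrompt({ metric: "Detection", before: 93.7, after: 99.2 })` reproduces the Hermes message quoted in this step.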

3. Sprint Pipeline

Forge creates worktree of antilles-v2 branch

Code written in that worktree (not in Forge)

Tests run, cross-model review

PR submitted to antilles-v2

4. Deploy

You say: "ship rmt testnet"

Forge deploys from antilles-v2 to Sepolia

Health checks, verification

Dashboard updated with deployment status

4-Model Debate: Critical Gaps Found

Grok + Gemini: Both Found

Gap | Risk | Proposed Fix | Status
1. Concurrency Locking | HIGH: two tracks modify the same file → conflict | Lock manager or serialization queue | NEEDS DESIGN
2. OpenClaw Coexistence | MEDIUM: parallel systems, race conditions | Migrate OpenClaw dispatch into Forge OR make it a pure relay | NEEDS DECISION
3. External Repo Updates | MEDIUM: manual commits make Forge stale | Git webhooks or periodic sync | NEEDS DESIGN
4. CI/CD Integration | MEDIUM: unclear if Forge replaces GH Actions | Forge creates PRs, GH Actions handles post-merge | NEEDS DECISION
5. Security/Approvals | HIGH: no RBAC for production deploys | Role-based access + multi-party approval | NEEDS DESIGN
6. Cost Sustainability | MEDIUM: research loops could overrun | Paperclip thresholds + auto-scaling | PARTIAL (Paperclip exists)
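Gap 1 is marked as needing design, so the following is only one possible shape for the "lock manager" option: an in-memory table that records which track owns which file. `FileLockManager` and its methods are hypothetical, and the sketch assumes a single Forge process; coordinating several processes would need a shared, durable store instead.

```typescript
// Hypothetical in-memory lock manager keyed by "repo:path".
class FileLockManager {
  private readonly owners = new Map<string, string>();

  private key(repo: string, path: string): string {
    return `${repo}:${path}`;
  }

  // Try to take the lock for a track; returns false if another track holds it.
  // Re-acquiring a lock the track already owns succeeds.
  acquire(track: string, repo: string, path: string): boolean {
    const k = this.key(repo, path);
    const owner = this.owners.get(k);
    if (owner !== undefined && owner !== track) {
      return false;
    }
    this.owners.set(k, track);
    return true;
  }

  // Release the lock only if the caller is the current owner.
  release(track: string, repo: string, path: string): boolean {
    const k = this.key(repo, path);
    if (this.owners.get(k) !== track) {
      return false;
    }
    this.owners.delete(k);
    return true;
  }
}
```

A track that fails to acquire a lock could either wait or be placed on a queue, which is where this option meets the "serialization queue" alternative from the table.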

Opus: Additional Concerns

  • Research harness has stale copy of production algorithms — needs submodule or workspace reference
  • 47 ARCH findings blocking pipeline — fix in antilles-v2, not forge
  • Hermes profiles have 0 memories — event-driven supervision not wired
  • 90-day acceptance window too long for bots — needs redesign

Codex: Awaiting Response

Codex is still processing; this section will be updated when its response arrives.

Implementation Priority

P0: tracks.yaml Enhancement (4 hours)

Add repo + branch fields. Scope dispatch-review per-track. Foundation for everything else.
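As an illustration of what the added fields enable, here is a sketch of the track mapping as typed data. The track names, repos, and branches are copied from the architecture diagram; the `TrackConfig` shape, `resolveTrack`, and `tracksByRepo` are assumptions, since the actual tracks.yaml schema is not shown.

```typescript
// Assumed per-track shape after the enhancement: one repo and one branch per track.
interface TrackConfig {
  repo: string;
  branch: string;
}

// Values copied from the tracks.yaml mapping in the architecture diagram.
const TRACKS: Record<string, TrackConfig> = {
  "rmt-reputation": { repo: "antilles-v2", branch: "feature/rmt-core-porting" },
  identity: { repo: "antilles-v2", branch: "feature/identity-foundation" },
  x402: { repo: "antilles-v2", branch: "agent/x402-trust-channels" },
  lottery: { repo: "lottery", branch: "main" },
};

// Look up a track, failing loudly on unknown names instead of guessing a repo.
function resolveTrack(name: string): TrackConfig {
  const track = TRACKS[name];
  if (track === undefined) {
    throw new Error(`Unknown track: ${name}`);
  }
  return track;
}

// Group track names by repo; tracks that share a repo are the ones that can collide.
function tracksByRepo(): Map<string, string[]> {
  const grouped = new Map<string, string[]>();
  for (const [name, config] of Object.entries(TRACKS)) {
    const names = grouped.get(config.repo) ?? [];
    names.push(name);
    grouped.set(config.repo, names);
  }
  return grouped;
}
```

Grouping by repo shows that three of the four tracks target antilles-v2, which is the situation the concurrency gap describes.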

P0: Hermes Event-Driven Supervision (8 hours)

HermesSupervisorHandler in Memory Router event bus. Replaces cron-based approach. Proactive Telegram alerts.
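The difference from a cron-based approach is that the handler reacts when an event is published rather than polling on a schedule. A minimal sketch follows, assuming a simple publish/subscribe bus: only the name `HermesSupervisorHandler` comes from this document, while `EventBus`, `TrackEvent`, the event types, and the `notify` callback (standing in for a Telegram sender) are hypothetical.

```typescript
// Hypothetical event shape; the real Memory Router event bus API is not shown here.
interface TrackEvent {
  track: string;
  type: "verdict" | "error" | "heartbeat";
  message: string;
}

type Listener = (event: TrackEvent) => void;

// Minimal synchronous publish/subscribe bus used only for this sketch.
class EventBus {
  private readonly listeners: Listener[] = [];

  subscribe(listener: Listener): void {
    this.listeners.push(listener);
  }

  publish(event: TrackEvent): void {
    for (const listener of this.listeners) {
      listener(event);
    }
  }
}

// Subscribes once at construction and alerts on every non-heartbeat event.
class HermesSupervisorHandler {
  readonly alerts: string[] = [];
  private readonly notify: (text: string) => void;

  constructor(bus: EventBus, notify: (text: string) => void) {
    this.notify = notify;
    bus.subscribe((event) => this.handle(event));
  }

  private handle(event: TrackEvent): void {
    if (event.type === "heartbeat") {
      return; // routine liveness signal, not worth an alert
    }
    const text = `[${event.track}] ${event.type}: ${event.message}`;
    this.alerts.push(text);
    this.notify(text);
  }
}
```

Passing `notify` in as a callback keeps the handler testable without a live Telegram connection.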

P1: Memory Bridge (4 hours)

Research findings → target repo WORKING_MEMORY.md automatically. Curated summaries, not raw data.
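"Curated summaries, not raw data" can be sketched as a filter plus a renderer. Everything here is an assumption for illustration: the `ResearchFinding` shape, the `confidence` field and its 0.8 threshold, and the output layout are hypothetical, since the actual WORKING_MEMORY.md format is not shown.

```typescript
// Hypothetical finding shape; `confidence` and `rawData` are illustrative fields.
interface ResearchFinding {
  track: string;
  summary: string;
  confidence: number; // 0..1
  rawData?: unknown; // stays in Forge; never copied to the product repo
}

// Keep only findings at or above the threshold and render one summary line each.
// The raw data is deliberately not part of the output.
function renderWorkingMemory(findings: ResearchFinding[], minConfidence = 0.8): string {
  const lines = findings
    .filter((finding) => finding.confidence >= minConfidence)
    .map((finding) => `- [${finding.track}] ${finding.summary}`);
  return ["# Working Memory", "", ...lines].join("\n");
}
```

The returned string is what a bridge handler would write into the target repo; the file write itself is left out so the sketch has no filesystem dependency.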

P1: Test Deploy Pipeline (8 hours)

VALIDATE stage creates worktree of target repo, runs tests + testnet deploy.

P2: 47 ARCH Findings (4-8 hours)

Fix in antilles-v2 where they belong. Hardcoded values → config, missing error handling → try/catch.

What's Built vs What's Missing

Component | Status | Details
Forge CLI | BUILT | 30+ commands including new-track (built today)
Telegram Dispatch (OpenClaw) | WORKING | Relays to/from Telegram, creates worktrees
Hermes (6 profiles) | PARTIAL | Running but passive; needs event-driven supervision
Memory Router | WORKING | 8 providers, event bus, 17 hermes-memory entries
Research Loops (4 tracks) | RUNNING | Swarma 4-model rotation, all tracks polling
RMT Contracts + Oracle | BUILT | 11 contracts, 34 oracle files, 92 tests
Identity Portal | BUILT | 304 commits, 150+ React components, full backend
RMT Dashboard | BUILT | 60 React components with D3.js graph viz
ERC-8004 Fetcher | BUILT | Needs Base mainnet config only
tracks.yaml Scoping | TODO | Needs repo + branch fields + dispatch-review scoping
HermesSupervisorHandler | TODO | Seed brief created, needs sprint pipeline
Memory Bridge to Repos | TODO | scanPaths added today, handler needed
Test Deploy Pipeline | TODO | VALIDATE stage needs worktree-into-target-repo
Multi-Chain Indexer | TODO | HelixaApiFetcher + ERC8183Fetcher needed (~220 LOC)
Eval Matrix (5-dim) | TODO | Framework designed, needs eval-matrix.ts (~300 LOC)
Concurrency Lock Manager | NOT DESIGNED | Critical gap: two tracks, same repo
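For the Eval Matrix row, the five dimensions named in the architecture (Scenario×Actor×Scale×Condition×Metric) suggest a Cartesian product of dimension values. The sketch below is one way the core of eval-matrix.ts could look; it is not the designed framework, and the `EvalDimensions`, `EvalCell`, and `buildEvalMatrix` names are hypothetical.

```typescript
// One value list per dimension named in the architecture diagram.
interface EvalDimensions {
  scenarios: string[];
  actors: string[];
  scales: string[];
  conditions: string[];
  metrics: string[];
}

// One cell is a single combination of the five dimensions.
interface EvalCell {
  scenario: string;
  actor: string;
  scale: string;
  condition: string;
  metric: string;
}

// Enumerate every combination; the cell count is the product of the list lengths.
function buildEvalMatrix(dimensions: EvalDimensions): EvalCell[] {
  const cells: EvalCell[] = [];
  for (const scenario of dimensions.scenarios) {
    for (const actor of dimensions.actors) {
      for (const scale of dimensions.scales) {
        for (const condition of dimensions.conditions) {
          for (const metric of dimensions.metrics) {
            cells.push({ scenario, actor, scale, condition, metric });
          }
        }
      }
    }
  }
  return cells;
}
```

Because the cell count multiplies across dimensions, even short value lists grow quickly, which ties this component to the cost sustainability gap.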

Competitive Landscape (Today's Research)

Competitor | Threat | Agents | Our Response
ERC-8004 ecosystem | HIGH | 130K+ | We complete it (sybil oracle + trust delegation)
ERC-8183 + Virtuals | HIGH | 18K | Our hooks add trust gating
Helixa (CredOracle) | MED-HIGH | 132K indexed | Centralized, no sybil detection, cap at 50
Cred Protocol | MED | 200M addr | Credit scoring only
FairScale | DEAD | | Treasury liquidated

Today's Research Session Summary

Full-day strategic research produced:

Forge Architecture Debate Dashboard — Generated April 1, 2026 | Research Dashboard | Execution Plan | Eval Framework