Superintel — A Personal Quantitative Trading Platform: Ports, Adapters & a 22-Model Ensemble
A solo-built, full-lifecycle statistical arbitrage platform connecting market data ingestion, ML prediction, risk gating, human approval, dual-broker execution, and real-time monitoring into one maintainable system — held together by strict hexagonal architecture so each layer can change without rewriting the others.
The Real Problem: Maintainability at Trading-System Scale
The problem was not "build a pairs trading bot." The problem was: build a platform where each step of the trading lifecycle can change independently. Swap the database. Add a data source. Modify the model ensemble. Change execution venues. Do any of these without rewriting core logic.
This is non-trivial because trading systems combine heterogeneous external dependencies (brokers, market data providers, LLM APIs), time-series data at scale, and operational scheduling across multiple cadences — all with correctness constraints that must hold under partial failure.
The architectural answer: strict hexagonal boundaries. Domain services define what they need via ports (interfaces). Infrastructure provides adapters that implement those ports. No SQLAlchemy, Redis, or HTTP client references inside domain code. Ever.
Design Constraints
Scope honesty: Superintel is a personal platform with production-grade architectural seams. It is not a hardened multi-user SaaS. Known gaps — optional JWT auth, unauthenticated WebSockets — are documented and have a concrete hardening plan.
Stack & Infrastructure
| Layer | Technology |
|---|---|
| Backend runtime | FastAPI (Python 3.11), Pydantic v2, async/await, DI factory pattern |
| Domain architecture | DDD / hexagonal — domain services + ports, adapters for every external concern |
| Primary database | PostgreSQL 14 + TimescaleDB — 40 tables, 20 hypertables with compression + retention |
| Caching / rate-limiting | Redis 7 — L2 cache and per-route rate limiting |
| Frontend | React + TypeScript, WebSocket real-time feeds |
| Market data | YFinance, Binance (CCXT), FMP economic calendar, Coinglass, NewsAPI |
| AI / LLM layer | Perplexity (sentiment), OpenAI (audit) — both behind port abstractions |
| Broker execution | IBKR (Docker gateway), Binance via CCXT, Composite routing adapter |
| Observability | Prometheus metrics, health endpoints, structured logging |
| Infrastructure | Docker Compose (IBKR gateway, Postgres, Redis) |
Hexagonal Architecture — Full System
API routers call domain services. Domain services depend only on port interfaces. Infrastructure adapters implement those ports and are selected at runtime via environment variables — enabling memory/mock mode for testing with zero domain changes.
Adapter Selection via Environment
Every port has multiple adapter implementations. The factory reads an environment variable and wires the correct adapter — postgres for production, memory for unit tests, mock for CI. Domain code never knows which is active.
This pattern is applied consistently: *_REPO_IMPL, CACHE_IMPL, HTTP_CLIENT_IMPL, BROKER_IMPL. The DDD compliance test suite validates that swapping adapters via config produces identical domain behaviour.
| Adapter value | Environment |
|---|---|
| `postgres` | Production |
| `memory` | Unit tests |
| `mock` | CI/CD |
```python
import os


def get_price_data_repository(
    db_client=None,
) -> PriceDataRepositoryPort:
    """
    Domain code calls this and receives a PriceDataRepositoryPort.
    It never sees Postgres, Redis, or any infrastructure symbol.
    """
    impl = os.getenv("PRICE_DATA_REPO_IMPL", "postgres")

    match impl:
        case "postgres":
            return PostgresPriceDataRepository(
                db_client or get_db_client()
            )
        case "memory":
            return InMemoryPriceDataRepository()
        case "mock":
            return MockPriceDataRepository()
        case _:
            raise ValueError(f"Unknown adapter: {impl}")


# DDD compliance test — validates the contract holds
def test_adapter_swap_produces_same_contract():
    for impl in ["postgres", "memory", "mock"]:
        os.environ["PRICE_DATA_REPO_IMPL"] = impl
        repo = get_price_data_repository()
        assert isinstance(repo, PriceDataRepositoryPort)
        assert hasattr(repo, "get_prices")
        assert hasattr(repo, "write_prices")
```
End-to-End Signal-to-Fill Sequence
Every trade traverses: ingestion → feature store → ML ensemble → approval gate → risk veto → broker execution → journal write → WebSocket broadcast. No step is skippable — correctness is structural, not optional.
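The non-skippable gate sequence can be sketched as a chain of functions that either pass the signal through or raise. This is an illustrative sketch only; the names `Signal`, `GateRejected`, `approval_gate`, and `risk_gate` are hypothetical stand-ins, not the platform's real API.

```python
from dataclasses import dataclass, field


@dataclass
class Signal:
    pair: str
    score: float
    audit_trail: list[str] = field(default_factory=list)


class GateRejected(Exception):
    """Raised when any mandatory gate vetoes the signal."""


def approval_gate(signal: Signal, approved: bool) -> Signal:
    # Human approval is an explicit input; no default-approve path exists.
    if not approved:
        raise GateRejected(f"approval denied for {signal.pair}")
    signal.audit_trail.append("approved")
    return signal


def risk_gate(signal: Signal, max_score: float = 3.0) -> Signal:
    # A risk veto is structural: there is no code path around it.
    if abs(signal.score) > max_score:
        raise GateRejected(f"risk veto: |z|={signal.score} > {max_score}")
    signal.audit_trail.append("risk_ok")
    return signal


def run_pipeline(signal: Signal, approved: bool) -> Signal:
    # Each stage returns the signal or raises; skipping a gate is impossible.
    return risk_gate(approval_gate(signal, approved))
```

Because each gate either returns the signal or raises, a trade that reaches broker execution carries proof (its audit trail) that every upstream gate ran.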
Composite Broker Gateway
A single BrokerPort interface hides all broker complexity from trading logic. The composite adapter inspects PairOrder.asset_type and routes to the correct provider — IBKR for equities, Binance for crypto.
Upstream services submit one order type. The gateway handles routing, retry, and error normalisation. Adding a third venue requires one new adapter class and a routing rule — zero changes in domain code.
```python
class CompositeBrokerAdapter(BrokerPort):
    """
    Routes orders by asset_type.
    Upstream code only sees BrokerPort — no IBKR/Binance imports.
    """
    def __init__(
        self,
        ibkr: IBKRAdapter,
        binance: BinanceAdapter,
    ):
        self._ibkr = ibkr
        self._binance = binance

    async def submit_order(
        self, order: PairOrder
    ) -> OrderResult:
        match order.asset_type:
            case AssetType.EQUITY:
                return await self._ibkr.submit_order(order)
            case AssetType.CRYPTO:
                return await self._binance.submit_order(order)
            case _:
                raise UnsupportedAssetTypeError(
                    order.asset_type
                )

    async def get_positions(self) -> list[Position]:
        equity = await self._ibkr.get_positions()
        crypto = await self._binance.get_positions()
        return equity + crypto
```
Always-On Pipeline Architecture
Multiple cadences co-exist: crypto ingestion every 5 minutes, equity data hourly, EOD reconciliation + P&L snapshots, and weekly model training. Graceful degradation and freshness checks prevent stale data from reaching signal generation.
| Cadence | Pipeline |
|---|---|
| Every 5 min | Crypto ingest |
| Hourly | Equity ingest |
| Daily | EOD reconcile |
| Weekly | Model retrain |
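One way to reason about the co-existing cadences is pure interval arithmetic: a job is due whenever its interval divides the elapsed minute. This is a simplified sketch, not the platform's actual scheduler; the `CADENCES` table mirrors the cadences listed above, but the job names and `due_jobs` helper are hypothetical.

```python
# Interval per job, in minutes (values mirror the cadences above).
CADENCES = {
    "crypto_ingest": 5,
    "equity_ingest": 60,
    "eod_reconcile": 60 * 24,
    "model_retrain": 60 * 24 * 7,
}


def due_jobs(minute: int) -> list[str]:
    """Return every job whose interval divides the elapsed minute."""
    return [job for job, interval in CADENCES.items()
            if minute % interval == 0]
```

The coordination surface comes from the overlaps: at minute 1440 the crypto, equity, and EOD jobs all fire together, so each needs its own failure handling rather than a shared retry.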
22-Model Prediction Ensemble
PredictionService dynamically loads model configurations from the ModelRegistry by run_id. Each weekly training cycle records metrics and pushes a new run. The registry controls which run is active in production — rollback is a single config update.
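The registry-as-pointer idea can be sketched in a few lines. This is a minimal in-memory illustration under my own naming, not the real ModelRegistry: `push_run`, `activate`, and `active_run` are hypothetical method names chosen to show why rollback is a single pointer update.

```python
class ModelRegistry:
    """Minimal sketch: runs are immutable records, 'active' is a pointer."""

    def __init__(self):
        self._runs: dict[str, dict] = {}
        self._active: str | None = None

    def push_run(self, run_id: str, metrics: dict) -> None:
        # Each weekly training cycle appends a run with its metrics.
        self._runs[run_id] = {"metrics": metrics}

    def activate(self, run_id: str) -> None:
        # Promotion and rollback are the same operation: move the pointer.
        if run_id not in self._runs:
            raise KeyError(f"unknown run: {run_id}")
        self._active = run_id

    def active_run(self) -> str:
        assert self._active is not None, "no active run"
        return self._active
```

Because runs are never mutated, rolling back to last week's ensemble never requires retraining, only re-pointing.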
Feature Store — TimescaleDB + JSONB
Feature vectors are stored as schema-versioned JSONB payloads in TimescaleDB hypertables. Compression and retention policies apply per-table. Every vector is tagged with a schema_version FK — the model loader validates schema compatibility before inference.
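The compatibility check before inference can be sketched as a guard over the payload's version tag. The `schema_version` tagging comes from the design above; the `SUPPORTED_SCHEMAS` set, version strings, and `validate_feature_vector` helper are illustrative assumptions, not the actual loader.

```python
# Hypothetical: the schema versions the active model was trained against.
SUPPORTED_SCHEMAS = {"v3", "v4"}


def validate_feature_vector(payload: dict) -> dict:
    """Reject JSONB payloads whose schema_version the model cannot consume.

    Runs before inference so a stale or future-schema vector never
    reaches the ensemble silently.
    """
    version = payload.get("schema_version")
    if version not in SUPPORTED_SCHEMAS:
        raise ValueError(f"incompatible feature schema: {version!r}")
    return payload["features"]
```

Failing loudly here is the point: a schema drift between the feature writer and the model loader surfaces as an error at load time, not as a silently wrong prediction.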
| Metric | Count |
|---|---|
| Total tables | 40 |
| Hypertables | 20 |
| Domain modules | 24 |
| Passing tests | 2,692 |
2,692 Tests — Honest Coverage Breakdown
Coverage is high for domain logic where the hexagonal structure enables true isolation. Infrastructure adapters and integration paths have lower coverage. Reporting a single headline number would mislead — the breakdown is more informative.
Domain Logic
Unit — all via in-memory adapters
Domain services and use-cases run against InMemory adapters. No DB, no network. Fast and deterministic.
DDD Compliance
Adapter swap contract tests
Explicit suite validates that every port implementation behaves identically — swapping postgres for memory changes nothing upstream.
Infrastructure / Integration
Integration + E2E (partial)
Adapter-level tests and full-stack E2E paths exist but are not exhaustive. Broker adapters and LLM paths rely on mocks.
Known Security Gaps & Concrete Hardening Plan
Personal-scope assumptions leave real security debt. These are documented, understood, and have specific remediation steps — not swept under a "future work" heading.
Optional JWT across routers
Hardening plan: Turn 'optional' into required middleware for all sensitive endpoints. Formalise roles and scopes at the router level. Add auth-required negative tests per router.
WebSocket lacks authentication
Hardening plan: Token validation on connect. Per-channel authorization. Replay and expiry handling. WS auth integration test suite.
Hardcoded approver identity
Hardening plan: Replace with a real approver model: users/roles, request ownership, and audit-trail integrity. Approval router currently has known issues documented in the codebase.
Provider API key hygiene
Hardening plan: Per-environment scoped keys, server-side-only storage, rotation policy, and least-privilege scoping for YFinance/FMP/OpenAI/Perplexity integrations.
Trade-offs & Rejected Alternatives
✓ Hexagonal boundaries over direct framework coupling
Faster to call SQLAlchemy/Redis directly from a service. But that makes testability expensive and infrastructure changes painful at 24-module scale. The complexity cost paid upfront enables long-term changeability without domain rewrites.
✓ Composite broker gateway over per-service broker calls
Each service knowing about IBKR/Binance would scatter routing logic. A single composite adapter centralises routing and error normalisation — trading logic stays broker-agnostic, which is critical when adding a third venue.
✓ Execution correctness over execution sophistication
TWAP/VWAP algorithms were explicitly deprioritised. The platform focuses on correctness gates — quality checks, approval vetoes, risk limits — from signal to order to journal write. A robust end-to-end loop beats sophisticated execution on an unreliable pipeline.
✓ DB-backed feature store over in-memory feature pipeline
In-memory pipelines lose state on restart. TimescaleDB hypertables with schema-versioned JSONB persist features durably, support compression/retention, and enable replay — important for weekly retraining across multi-year lookbacks.
Lessons & What I'd Do Differently
Authentication should be foundational, not additive
Starting with 'optional JWT' creates auth debt that grows with every new router. In the next system, auth middleware is wired first and every sensitive endpoint is auth-required by default, with explicit opt-out for public routes.
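"Auth-required by default, explicit opt-out for public routes" inverts the usual registration logic, and the inversion fits in a few lines. This sketch is framework-agnostic on purpose; the `public` helper and `requires_auth` check are hypothetical names, not FastAPI machinery.

```python
# Default-deny route policy: a route skips auth only if explicitly opted out.
PUBLIC_ROUTES: set[str] = set()


def public(path: str) -> str:
    """Explicit opt-out: only routes registered here skip auth."""
    PUBLIC_ROUTES.add(path)
    return path


def requires_auth(path: str) -> bool:
    # Any route not explicitly marked public needs a valid token.
    return path not in PUBLIC_ROUTES
```

With this inversion, forgetting to annotate a new router produces a locked-down endpoint instead of an open one, so auth debt cannot silently grow with each new route.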
Domain coverage ≠ overall coverage — report both
A single headline test coverage number misleads when domain logic is well-tested but infrastructure adapters are not. The honest metric is domain coverage, infrastructure coverage, and total — all three, explained.
Operational scheduling complexity compounds
Five concurrent cadences (5 min, hourly, EOD, weekly, + on-demand) each with their own failure modes and fallbacks is a higher coordination surface than anticipated. Explicit circuit-breakers and a dead-letter pattern for failed jobs would be the first-class addition.
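The dead-letter pattern mentioned above can be sketched as a retry counter that parks a job after a bounded number of failures instead of retrying forever. The `DeadLetterQueue` class and its thresholds are an illustrative assumption, not an existing component of the platform.

```python
class DeadLetterQueue:
    """Sketch: bound retries per job, park persistent failures for replay."""

    def __init__(self, max_attempts: int = 3):
        self.max_attempts = max_attempts
        self.attempts: dict[str, int] = {}
        self.parked: list[str] = []

    def record_failure(self, job_id: str) -> bool:
        """Return True while the job may still be retried.

        Once a job exhausts max_attempts it is parked rather than
        retried, so one stuck cadence cannot starve the others.
        """
        self.attempts[job_id] = self.attempts.get(job_id, 0) + 1
        if self.attempts[job_id] >= self.max_attempts:
            self.parked.append(job_id)   # stop retrying; await manual replay
            return False
        return True
```

Parked jobs become an operator queue: replay is a deliberate action with the failure history attached, instead of an infinite retry loop competing with the live cadences.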
Key Architectural Insight
Hexagonal architecture at 24-module scale is not primarily a design problem — it is a discipline problem. Every new module has an opportunity to shortcut a port and import infrastructure directly. The DDD compliance test suite is what makes the boundary enforceable rather than aspirational.
Scope Positioning
Superintel is not presented as a production SaaS — it is a personal quantitative platform with production-grade architectural seams. The ports/adapters model, DI discipline, and test suite represent the engineering standard. The auth gaps represent personal-scope pragmatism that has a concrete remediation path.