ML / Quant Engineer · Writing & Code
Writing on reinforcement learning, MLOps, and quantitative systems — each post backed by runnable code and real benchmark numbers from a four-year solo build of a SAC pair-trading system.
More posts
Why expected Q-values aren't enough, and what 51 quantiles get you that SAC can't.
Prioritized Experience Replay in 210 lines of numpy. No dependencies beyond numpy, O(log N) sampling, full integration with twin-critic SAC.
Why 'we chose SAC' is a better answer than 'we used SAC.' Building a 3-seed comparison that survives an interview.
An opt-in tracker that's a no-op without config, a Postgres-backed deployment, and a measured CPU latency win from 10 lines of actor changes.
A tiny serving layer for RL agents that picks up new training runs without a restart. No watchers, no pub/sub, no Kubernetes.
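The O(log N) sampling mentioned in the Prioritized Experience Replay teaser above is usually achieved with a sum tree. The following is a minimal sketch of that data structure, not the code from the post: the class name `SumTree`, its method names, and the power-of-two capacity restriction are assumptions made for illustration.

```python
import numpy as np


class SumTree:
    """Binary sum tree in a flat array: index 1 is the root, leaves hold priorities."""

    def __init__(self, capacity: int):
        # Assumption for this sketch: capacity is a power of two, which keeps indexing simple.
        if capacity <= 0 or capacity & (capacity - 1):
            raise ValueError("capacity must be a power of two")
        self.capacity = capacity
        self.tree = np.zeros(2 * capacity, dtype=np.float64)

    @property
    def total(self) -> float:
        """Sum of all priorities (stored at the root)."""
        return float(self.tree[1])

    def update(self, index: int, priority: float) -> None:
        """Set one leaf and recompute its ancestors: O(log N)."""
        pos = index + self.capacity
        self.tree[pos] = priority
        pos //= 2
        while pos >= 1:
            # Recompute from the children rather than adding a delta, to avoid drift.
            self.tree[pos] = self.tree[2 * pos] + self.tree[2 * pos + 1]
            pos //= 2

    def find(self, mass: float) -> int:
        """Return the leaf whose cumulative-priority interval contains `mass`: O(log N)."""
        pos = 1
        while pos < self.capacity:
            left = 2 * pos
            if mass < self.tree[left]:
                pos = left
            else:
                mass -= self.tree[left]
                pos = left + 1
        return pos - self.capacity

    def sample(self, batch_size: int, rng: np.random.Generator) -> np.ndarray:
        """Stratified sampling: one uniform draw from each equal slice of the total mass."""
        bounds = np.linspace(0.0, self.total, batch_size + 1)
        masses = rng.uniform(bounds[:-1], bounds[1:])
        return np.array([self.find(m) for m in masses])
```

Each transition is drawn with probability proportional to its priority, so a leaf with priority 3 is sampled about three times as often as a leaf with priority 1. Importance-sampling weights and the SAC integration are left out of this sketch.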
Statistical Arbitrage System · SAC RL · Live on IBKR
OOS Sharpe: 3.716
Ann. Return: +71.5%
IBKR Live Pairs: 32
An ML-driven stat-arb system, designed and operated solo — from alpha research through live execution on IBKR. The core problem with statistical arbitrage is regime dependency: cointegration that holds in mean-reverting conditions deteriorates in trending regimes, causing pair spreads to diverge without reversion. An HMM regime classifier detects the current market state in real time and routes signals to the appropriate strategy branch. A SAC RL agent handles position sizing — its entropy-maximizing objective scales exposure with predicted signal strength, automatically cutting risk when conviction is low. Entry and exit signals are generated by an XGBoost / LightGBM / CatBoost ensemble and a PyTorch TFT sequence model, trained on multi-timeframe features from FMP, FRED, yfinance, and Alpha Vantage, stored in a DuckDB feature store. Live orders execute on IBKR via ib-async.
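To make the real-time regime detection step concrete, here is a minimal sketch of online forward filtering for a two-state Gaussian HMM, followed by a simple router. It is an illustration under stated assumptions, not the system's implementation: the class `RegimeFilter`, the function `route`, the branch names, and every numeric parameter are hypothetical, and the observation is assumed to be a single scalar feature.

```python
import numpy as np


def gaussian_pdf(x: float, mean: np.ndarray, std: np.ndarray) -> np.ndarray:
    """Density of x under one Gaussian per hidden state."""
    z = (x - mean) / std
    return np.exp(-0.5 * z * z) / (std * np.sqrt(2.0 * np.pi))


class RegimeFilter:
    """Online forward filter: keeps P(state | observations so far)."""

    def __init__(self, trans, means, stds, prior):
        self.trans = np.asarray(trans, dtype=float)   # trans[i, j] = P(next = j | current = i)
        self.means = np.asarray(means, dtype=float)
        self.stds = np.asarray(stds, dtype=float)
        self.belief = np.asarray(prior, dtype=float)

    def step(self, obs: float) -> np.ndarray:
        predicted = self.belief @ self.trans                      # propagate one step ahead
        posterior = predicted * gaussian_pdf(obs, self.means, self.stds)
        self.belief = posterior / posterior.sum()                 # Bayes update, renormalized
        return self.belief


def route(belief: np.ndarray, branches=("mean_reverting", "trending")) -> str:
    """Pick the strategy branch for the most probable regime (hypothetical names)."""
    return branches[int(np.argmax(belief))]


# Hypothetical parameters: state 0 is a quiet regime, state 1 a volatile one.
regime_filter = RegimeFilter(
    trans=[[0.95, 0.05], [0.05, 0.95]],
    means=[0.0, 0.0],
    stds=[0.5, 2.0],
    prior=[0.5, 0.5],
)
for obs in [0.1, -0.2, 0.05]:
    regime_filter.step(obs)
quiet_branch = route(regime_filter.belief)
for obs in [3.0, -2.8, 3.5]:
    regime_filter.step(obs)
volatile_branch = route(regime_filter.belief)
```

In practice the transition matrix and emission parameters would be fitted offline, and the filter would only run the cheap `step` update on each new observation.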
LLM-powered PDF Translation Desktop App
Languages: 30+
LLM Backend: Claude
Platform: Native
A Tauri 2 native desktop app for reading English technical books in your native language, paragraph by paragraph. The original approach tried to fine-tune T5 and fairseq models directly on Korean–English pairs; it was abandoned when the Korean training corpus proved too thin, producing token-level noise instead of coherent output. The architecture was rebuilt around Claude Haiku via the Anthropic API: the Rust backend handles all network I/O through reqwest, while pdfjs-dist on the SvelteKit frontend extracts and segments paragraph-level text blocks from PDF files. An SSE-streamed Ask AI panel lets users ask questions mid-read, injecting the current page text as context so answers stay grounded in what's on screen.
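For readers unfamiliar with how an SSE stream is consumed, the sketch below parses the `data:` fields of a server-sent event stream into complete messages. It is written in Python for illustration only (the app itself uses Rust and SvelteKit), and the function name `iter_sse_data` and the sample lines are hypothetical.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete event in an SSE line stream."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is not part of the value
        if field == "data":
            buffer.append(value)
    # An event that is not terminated by a blank line is discarded.


# Hypothetical stream: two complete events and one unterminated event.
sample_stream = [
    "data: Hel",
    "",
    ": keep-alive",
    "data: lo",
    "data: world",
    "",
    "data: dropped",
]
chunks = list(iter_sse_data(sample_stream))
```

A streaming panel would append each yielded chunk to the visible answer as it arrives instead of collecting them in a list.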
Goal Management Desktop App with Built-in AI Coach
AI Providers: 4
Methodology: 3-in-1
Storage: Local-first
A Tauri 2 desktop app that replaces three separate productivity tools — Mandala Chart, GTD, and Pomodoro — with one coherent workflow. The Mandala Chart gives goals a spatial structure: each outer cell expands into its own 3×3 action plan, with a drill-down navigator that moves through hierarchy levels. GTD state management runs as an explicit state machine in Rust, tracking items across Inbox → Next Actions → Waiting → Done with enforced transitions. Pomodoro sessions drive the focus cycle and write session data to DuckDB through the Rust backend, keeping everything local-first with no cloud dependency. An Ask AI feature injects the current goal and its context into a configurable LLM prompt — supporting OpenAI, Anthropic, Gemini, and Groq — and streams the coaching response back via SSE.
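The enforced GTD transitions can be pictured as a small lookup-table state machine. The sketch below is in Python for illustration (the app implements this in Rust), and the specific set of allowed transitions is an assumption: the description only names the four states and their overall order.

```python
from enum import Enum


class GtdState(Enum):
    INBOX = "Inbox"
    NEXT_ACTIONS = "Next Actions"
    WAITING = "Waiting"
    DONE = "Done"


# Assumed transition table: items leave the Inbox once, may bounce between
# Next Actions and Waiting, and Done is terminal.
ALLOWED = {
    GtdState.INBOX: {GtdState.NEXT_ACTIONS},
    GtdState.NEXT_ACTIONS: {GtdState.WAITING, GtdState.DONE},
    GtdState.WAITING: {GtdState.NEXT_ACTIONS, GtdState.DONE},
    GtdState.DONE: set(),
}


class InvalidTransition(ValueError):
    """Raised when a move is not in the transition table."""


def transition(current: GtdState, target: GtdState) -> GtdState:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise InvalidTransition(f"{current.value} -> {target.value} is not allowed")
    return target


state = transition(GtdState.INBOX, GtdState.NEXT_ACTIONS)
state = transition(state, GtdState.WAITING)
state = transition(state, GtdState.DONE)
```

Keeping the rules in one table means an illegal move such as Inbox to Done fails loudly at the point of the call rather than leaving an item in an inconsistent state.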