AI Engineer · RAG / Agent Systems

RAG Agents
to Production AI.

LangGraph RAG, multi-agent orchestration, FastAPI model serving,
and data pipelines built into working products.

5 Projects · LangGraph RAG · FastAPI + Docker · Model Serving + Data Pipelines

For resume reviewers

What this portfolio proves at a glance

RAG / Agent Systems

LangGraph 8-node RAG, FAISS/BM25 hybrid retrieval, Cross-Encoder reranking, RAGAS evaluation, and Supervisor-style multi-agent workflows.

AI Backend / Serving

Python/FastAPI APIs, SSE streaming, Docker deployment, model hot reload, MLflow registry, and production-oriented observability.

Applied Product Work

Turns LLM output into tested features: RAG chat, manuscript editing agents, PDF translation, AI coaching, and trading inference services.

01 — RAG System
2024 – 2026

ChatBout AI

LangGraph RAG + Multi-Agent System

RAG Nodes

8

Agents

4

API Routes

11

A production-oriented RAG backend that turns documents into grounded answers through an 8-node LangGraph workflow. The graph classifies the request, rewrites queries, retrieves from hybrid search, reranks candidates, grades relevance, generates the answer, and runs a hallucination check before returning the response. A second LangGraph supervisor routes complex requests across specialized RAG, Code, Analysis, and Chitchat workers, then aggregates the result. The system is exposed through FastAPI endpoints with SSE streaming, Docker deployment, RAGAS evaluation, and LangSmith tracing.
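The control flow described above can be sketched in plain Python. This is a minimal illustration, not the project's LangGraph code: `run_rag`, the injected stage callables, the 0.5 relevance threshold, and the retry budget are all hypothetical names and values chosen for the example.

```python
from typing import Callable, List, Tuple

FALLBACK = "I could not find a grounded answer."

def run_rag(
    question: str,
    retrieve: Callable[[str], List[str]],
    grade: Callable[[str], float],
    generate: Callable[[str, List[str]], str],
    is_grounded: Callable[[str, List[str]], bool],
    max_retries: int = 2,  # assumed retry budget
) -> Tuple[str, List[str]]:
    """Run the staged flow and return (answer, visited stage names)."""
    trace = ["classify", "query_transform"]
    query = question
    for attempt in range(max_retries + 1):
        candidates = retrieve(query)
        trace.append("retrieve")
        # Stand-in for Cross-Encoder reranking: sort by the grader's score.
        ranked = sorted(candidates, key=grade, reverse=True)
        trace.append("rerank")
        relevant = [doc for doc in ranked if grade(doc) >= 0.5]
        trace.append("grade")
        if relevant:
            answer = generate(question, relevant)
            trace.append("generate")
            trace.append("hallucination_check")
            if is_grounded(answer, relevant):
                return answer, trace
        # Retry loop: rewrite the query and go around again.
        query = f"{question} (rewrite {attempt + 1})"
        trace.append("query_transform")
    return FALLBACK, trace

# Toy stages so the sketch runs without any model or index.
corpus = ["FAISS is a vector similarity search library."]
answer, trace = run_rag(
    "What is FAISS?",
    retrieve=lambda q: corpus,
    grade=lambda d: 1.0 if "FAISS" in d else 0.0,
    generate=lambda q, ctx: ctx[0],
    is_grounded=lambda a, ctx: any(a in d for d in ctx),
)
fallback_answer, fallback_trace = run_rag(
    "Unknown topic",
    retrieve=lambda q: [],
    grade=lambda d: 0.0,
    generate=lambda q, ctx: "",
    is_grounded=lambda a, ctx: False,
)
```

Passing the stages in as callables keeps the routing logic testable without a vector store or an LLM; in a graph framework the same branches would be expressed as conditional edges.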

Reviewer quick read

  • RAG system: classify → query transform → retrieve → rerank → grade → generate → hallucination check.
  • Search quality: FAISS/BM25 hybrid retrieval, HyDE, multi-query expansion, Cross-Encoder reranking, RAGAS metrics.
  • Agent orchestration: LangGraph Supervisor + 4 workers, multi-hop chaining, FastAPI endpoints, Docker deployment.
8-node LangGraph StateGraph with conditional routing and Self-RAG retry loops
FAISS/BM25/Hybrid retrievers with CacheBackedEmbeddings, MMR, HyDE, and 3-way multi-query expansion
Cross-Encoder reranking using ms-marco-MiniLM to reorder top-k retrieval candidates
LangGraph Supervisor pattern with RAG, Code, Analysis, and Chitchat workers
RAGAS evaluation: faithfulness, answer relevancy, context precision, context recall
Gold-standard Precision/Recall@k regression checks and LangSmith tracing
11 FastAPI endpoints including health, metrics, evaluate, bandit, invoke, and SSE chat paths
Docker deployment with embedding cache volume and service-oriented API boundaries
Python FastAPI LangGraph LangChain Haystack FAISS BM25 RAGAS LangSmith MongoDB Docker Claude OpenAI
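As an illustration of hybrid retrieval and the Precision/Recall@k checks listed above, here is a small sketch. It assumes reciprocal rank fusion as the merge strategy, which the project does not state it uses; the function names, the constant `k=60`, and the document ids are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Set, Tuple

def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge several ranked id lists; each hit contributes 1 / (k + rank)."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

def precision_recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> Tuple[float, float]:
    """Precision@k and Recall@k against a gold-standard set of relevant ids."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    recall = hits / len(relevant) if relevant else 0.0
    return hits / k, recall

dense_hits = ["d1", "d2", "d3"]   # e.g. from a vector index
sparse_hits = ["d3", "d1", "d4"]  # e.g. from BM25
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
precision, recall = precision_recall_at_k(fused, {"d1", "d3"}, k=2)
```

A regression check can then assert that `precision` and `recall` do not fall below a stored baseline whenever the retriever configuration changes.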
02 — Agent Workflow
2024 – 2026

Book Writer Agent

LangGraph/RAG Manuscript Editing Workflow

Books

2

Chapters

45

Pipeline

9-step

A manuscript editing agent built to turn English-first or translation-heavy drafts into natural Korean technical prose. The workflow uses LangGraph for staged writing and revision, ChromaDB for style retrieval, and SentenceTransformers for semantic search over reference books and blog samples. It preserves code blocks and headings, validates revised sections, detects length collapse or meta text leakage, and falls back to source text when revision quality fails. The pipeline was used on two technical books: one on Tauri 2 desktop app development and one on Python quant trading AI.
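The masking and fallback guardrails can be sketched as follows. This is a simplified illustration under stated assumptions: the placeholder format `[[CODE_n]]`, the `min_ratio` length threshold, and the function names are hypothetical, and heading masking and meta-text detection are left out.

```python
import re
from typing import Callable, List, Tuple

FENCE = "`" * 3  # built at runtime so this example can sit inside a fenced block
CODE_BLOCK = re.compile(re.escape(FENCE) + r".*?" + re.escape(FENCE), re.DOTALL)

def mask_code_blocks(text: str) -> Tuple[str, List[str]]:
    """Swap each fenced code block for a placeholder the LLM should leave alone."""
    blocks: List[str] = []

    def stash(match: re.Match) -> str:
        blocks.append(match.group(0))
        return f"[[CODE_{len(blocks) - 1}]]"

    return CODE_BLOCK.sub(stash, text), blocks

def unmask_code_blocks(text: str, blocks: List[str]) -> str:
    """Put the original code blocks back in place of their placeholders."""
    for index, block in enumerate(blocks):
        text = text.replace(f"[[CODE_{index}]]", block)
    return text

def safe_revise(source: str, revise: Callable[[str], str], min_ratio: float = 0.5) -> str:
    """Revise prose only; return the source unchanged if a guardrail trips."""
    masked, blocks = mask_code_blocks(source)
    revised = revise(masked)
    placeholders_ok = all(revised.count(f"[[CODE_{i}]]") == 1 for i in range(len(blocks)))
    long_enough = len(revised) >= min_ratio * len(masked)
    if not (placeholders_ok and long_enough):
        return source  # fallback recovery: keep the original section
    return unmask_code_blocks(revised, blocks)

section = "Intro text.\n" + FENCE + "python\nprint(1)\n" + FENCE + "\nOutro text."
good = safe_revise(section, lambda t: t.replace("Intro", "Introduction"))
bad = safe_revise(section, lambda t: "short")  # drops the placeholder
```

Here `revise` stands in for the LLM call; checking placeholder counts before unmasking catches both dropped and duplicated code blocks.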

Reviewer quick read

  • Agent workflow: outline → research → writer → editor → code review; RAG research → section edit → validation.
  • Vector retrieval: ChromaDB + ko-sroberta style search with file-hash incremental indexing.
  • Production guardrails: code/header masking, revised-block extraction, quality checks, fallback recovery.
LangGraph workflow for draft generation, research, editing, and validation
ChromaDB + jhgan/ko-sroberta-multitask for Korean technical style retrieval
Topic retrieval and writing-style retrieval combined into prompt-time context
9-step revision pipeline, including literal translation removal, flow review, terminology unification, beginner explanation, and polish
Code block and heading masking to keep technical structure stable during LLM rewriting
Fallback logic for missing code blocks, meta text, invalid revised tags, or severe length drop
Applied to 18 chapters of a Tauri 2 app book and 27 chapters of a Python quant trading AI book
Python LangGraph ChromaDB RAG Claude SentenceTransformers Markdown
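File-hash incremental indexing can be illustrated with a short sketch: only files whose content hash differs from the stored manifest are scheduled for re-embedding. The manifest shape, the SHA-256 choice, and the names `file_digest` and `plan_reindex` are assumptions made for this example.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, Iterable, List, Tuple

def file_digest(path: Path) -> str:
    """Content hash used to decide whether a file needs re-embedding."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def plan_reindex(paths: Iterable[Path], manifest: Dict[str, str]) -> List[Tuple[Path, str]]:
    """Return (path, digest) for files that are new or changed since the manifest."""
    pending: List[Tuple[Path, str]] = []
    for path in paths:
        digest = file_digest(path)
        if manifest.get(str(path)) != digest:
            pending.append((path, digest))
    return pending

with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "chapter01.md"
    sample.write_text("first draft", encoding="utf-8")

    manifest: Dict[str, str] = {}
    first_pass = plan_reindex([sample], manifest)
    # Record digests only after the (omitted) embedding step succeeds.
    manifest.update({str(path): digest for path, digest in first_pass})

    second_pass = plan_reindex([sample], manifest)  # unchanged file
    sample.write_text("second draft", encoding="utf-8")
    third_pass = plan_reindex([sample], manifest)   # edited file
```

Updating the manifest after indexing rather than inside `plan_reindex` means a failed embedding run is retried on the next pass.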
03 — Model Serving
2022 – Present

Stock Trading AI

Statistical Arbitrage System · SAC RL · Live on IBKR

OOS Sharpe

3.716

Ann. Return

+71.5%

IBKR Live Pairs

32

An ML-driven stat-arb system, designed and operated solo from alpha research through live execution on IBKR. The core problem with statistical arbitrage is regime dependency: cointegration that holds in mean-reverting conditions deteriorates in trending regimes, causing pair spreads to diverge without reversion. An HMM regime classifier detects the current market state in real time and routes signals to the appropriate strategy branch. A SAC RL agent handles position sizing: its policy, trained under SAC's entropy-regularized objective, scales exposure with predicted signal strength and cuts risk when conviction is low. Entry and exit signals are generated by an XGBoost / LightGBM / CatBoost ensemble and a PyTorch TFT sequence model, trained on multi-timeframe features from FMP, FRED, yfinance, and Alpha Vantage. QuestDB stores 10 years of 5-minute bars for time-series queries, with DuckDB used for analytical feature work. Live orders execute on IBKR via ib-async.
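The routing idea can be sketched in a few lines. Everything specific here is an assumption for illustration: the regime labels, the spread z-score input, the 2.0 entry threshold, and the linear conviction scaling are hypothetical and stand in for the HMM output and the learned sizing policy.

```python
from typing import Callable, Dict

def mean_reversion_signal(spread_zscore: float) -> float:
    """Fade the spread: short when rich, long when cheap, flat otherwise."""
    if spread_zscore > 2.0:   # assumed entry threshold
        return -1.0
    if spread_zscore < -2.0:
        return 1.0
    return 0.0

def stand_aside_signal(spread_zscore: float) -> float:
    """In a trending regime the spread may not revert, so take no position."""
    return 0.0

# Regime label (as a classifier might emit it) -> strategy branch.
ROUTES: Dict[str, Callable[[float], float]] = {
    "mean_reverting": mean_reversion_signal,
    "trending": stand_aside_signal,
}

def route_signal(regime: str, spread_zscore: float) -> float:
    """Pick the branch for the current regime; unknown regimes stay flat."""
    return ROUTES.get(regime, stand_aside_signal)(spread_zscore)

def sized_position(direction: float, conviction: float, max_units: float) -> float:
    """Scale exposure by conviction clipped to [0, 1]."""
    clipped = min(max(conviction, 0.0), 1.0)
    return direction * clipped * max_units

direction = route_signal("mean_reverting", 2.5)
units = sized_position(direction, conviction=0.25, max_units=100.0)
```

Keeping the router as a plain lookup makes it easy to add a regime without touching the signal functions themselves.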

Reviewer quick read

  • Finance system: market-data ingestion, QuestDB 5-minute bars, feature store, signal generation, broker API, and execution loop.
  • AI/RL: SAC/PPO/QR-DQN experiments, MLflow model management, FastAPI inference, CVaR sizing.
  • Operating proof: 32 live IBKR pairs, real-time P&L monitoring, Dockerized service layers.
OOS Sharpe 3.716 · Ann. Return +71.5% vs SPY benchmark +11.7%
HMM regime classifier feeds a strategy router — different signal logic per regime state
SAC RL agent drives position sizing: a learned, entropy-regularized policy scales exposure with signal conviction instead of fixed rules
Ensemble (XGBoost + LightGBM + CatBoost) + PyTorch TFT generate entry/exit signals; Optuna hyperparameter sweep per model
Data pipeline: FMP + FRED + yfinance + Alpha Vantage → QuestDB (10 years of 5-minute bars) + DuckDB feature work → model training & inference
FastAPI service layer: MongoDB (Beanie ODM) for trade records, Redis for cache, DuckDB for analytical queries — each layer purpose-fit
Live execution on IBKR — 32 active stat-arb pairs, real-time order management and P&L tracking via ib-async
Rust TUI (Ratatui + Tokio) for terminal monitoring; Tauri 2 + SvelteKit desktop dashboard
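For readers unfamiliar with CVaR sizing, the following sketch shows one common historical estimate: average the worst fraction of past returns, then size so that the expected tail loss matches a risk budget. The tail fraction, the budget rule, and the sample returns are assumptions for illustration, not the system's actual parameters or data.

```python
import math
from typing import Sequence

def historical_cvar(returns: Sequence[float], tail_fraction: float = 0.05) -> float:
    """Average loss over the worst `tail_fraction` of periods (a positive number means loss)."""
    if not returns:
        raise ValueError("returns must not be empty")
    losses = sorted((-r for r in returns), reverse=True)
    # round() guards against float noise such as 2.0000000001 before ceil().
    tail_count = max(1, math.ceil(round(len(losses) * tail_fraction, 9)))
    tail = losses[:tail_count]
    return sum(tail) / len(tail)

def cvar_sized_notional(risk_budget: float, cvar: float) -> float:
    """Notional such that the expected tail loss equals the risk budget."""
    if cvar <= 0:
        return 0.0  # no observed tail loss: this sketch declines to size
    return risk_budget / cvar

# Made-up per-period returns, used only to exercise the functions.
sample_returns = [0.01, -0.02, 0.03, -0.05, 0.00, 0.02, -0.01, 0.01, 0.02, -0.04]
cvar = historical_cvar(sample_returns, tail_fraction=0.2)
notional = cvar_sized_notional(risk_budget=900.0, cvar=cvar)
```

With ten observations and a 20% tail, the estimate averages the two worst periods; larger samples give a more stable figure.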
04 — LLM App
2023

ReadBooks.ai

LLM-powered PDF Translation Desktop App

Languages

30+

LLM Backend

Claude

Platform

Native

A Tauri 2 native desktop app for reading English technical books in your native language, paragraph by paragraph. The original approach attempted to fine-tune T5 and fairseq models directly on Korean–English pairs; it was abandoned when the Korean training corpus proved too thin, producing token-level noise instead of coherent output. The architecture was rebuilt around Claude Haiku via the Anthropic API, with the Rust backend handling all network I/O through reqwest, while pdfjs-dist on the SvelteKit frontend extracts and segments paragraph-level text blocks from PDF files. An SSE-streamed Ask AI panel lets users ask questions mid-read, injecting the current page text as context so answers stay grounded in what is on screen.
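The paragraph segmentation and context injection steps can be illustrated in Python, although the app itself does this in SvelteKit and Rust. The blank-line splitting rule, the prompt wording, the `<page>` delimiter, and the `max_chars` cap are assumptions for this sketch.

```python
import re
from typing import List

def split_paragraphs(page_text: str) -> List[str]:
    """Split extracted page text on blank lines and rejoin hard-wrapped lines."""
    parts = re.split(r"\n\s*\n", page_text)
    return [" ".join(part.split()) for part in parts if part.strip()]

def build_ask_prompt(question: str, page_text: str, max_chars: int = 4000) -> str:
    """Inject the current page as context so the answer refers to what is on screen."""
    context = "\n\n".join(split_paragraphs(page_text))[:max_chars]
    return (
        "Answer using only the page excerpt below.\n\n"
        f"<page>\n{context}\n</page>\n\n"
        f"Question: {question}"
    )

page = "First line\nwraps here.\n\n\nSecond para."
paragraphs = split_paragraphs(page)
prompt = build_ask_prompt("What does the second paragraph say?", page)
```

Capping the injected context keeps the request within the model's input budget when a page contains unusually dense text.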

Reviewer quick read

  • Native desktop app: Tauri 2 + Rust backend + SvelteKit frontend.
  • LLM product flow: PDF parsing, paragraph segmentation, Claude API translation, Ask AI streaming.
  • Engineering judgment: tried direct model training first, then switched to API when data quality was the bottleneck.
pdfjs-dist parses PDF structure and extracts text at paragraph granularity
Rust backend (reqwest + tokio) calls Claude Haiku API with async concurrency
30+ language support — translation target is user-configurable at runtime
SSE streaming delivers Ask AI responses token-by-token for low-latency feel
Failure-driven pivot: T5/fairseq fine-tune failed → insufficient Korean data → Claude API
Runs locally apart from Claude API calls; no server and no account required beyond an API key
05 — AI App
2024

Mandai

Goal Management Desktop App with Built-in AI Coach

LLM APIs

2

Methodology

3-in-1

Storage

Local-first

A Tauri 2 desktop app that replaces three separate productivity tools — Mandala Chart, GTD, and Pomodoro — with one coherent workflow. The Mandala Chart gives goals a spatial structure: each outer cell expands into its own 3×3 action plan, with a drill-down navigator that moves through hierarchy levels. GTD state management runs as an explicit state machine in Rust, tracking items across Inbox → Next Actions → Waiting → Done with enforced transitions. Pomodoro sessions drive the focus cycle and write session data to DuckDB through the Rust backend, keeping everything local-first with no cloud dependency. An Ask AI feature injects the current goal and its context into an OpenAI/Anthropic prompt and streams the coaching response back via SSE.
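An explicit state machine with enforced transitions can be sketched like this. The app implements it in Rust; this Python version is only an illustration, and the exact transition table (for example, whether Waiting may return to Next Actions) is an assumption.

```python
from enum import Enum
from typing import Dict, Set

class GtdState(Enum):
    INBOX = "inbox"
    NEXT_ACTION = "next_action"
    WAITING = "waiting"
    DONE = "done"

# Assumed transition table: every move not listed here is rejected.
ALLOWED: Dict[GtdState, Set[GtdState]] = {
    GtdState.INBOX: {GtdState.NEXT_ACTION},
    GtdState.NEXT_ACTION: {GtdState.WAITING, GtdState.DONE},
    GtdState.WAITING: {GtdState.NEXT_ACTION, GtdState.DONE},
    GtdState.DONE: set(),
}

class InvalidTransition(ValueError):
    """Raised when a requested state change is not in the table."""

def transition(current: GtdState, target: GtdState) -> GtdState:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise InvalidTransition(f"{current.value} -> {target.value} is not allowed")
    return target

state = GtdState.INBOX
for step in (GtdState.NEXT_ACTION, GtdState.WAITING, GtdState.DONE):
    state = transition(state, step)

try:
    transition(GtdState.INBOX, GtdState.DONE)
    skipped_ahead = True
except InvalidTransition:
    skipped_ahead = False
```

Centralizing the rules in one table means the UI cannot move an item into a state the backend would reject.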

Reviewer quick read

  • Local-first desktop system: Rust state machine, DuckDB storage, SvelteKit UI.
  • AI feature: OpenAI/Anthropic prompt flow with current goal context and SSE streaming.
  • Product design: combines Mandala Chart, GTD, and Pomodoro into one workflow.
3-in-1 workflow: Mandala Chart spatial hierarchy + GTD state machine + Pomodoro timer
Rust state machine enforces GTD transitions — no invalid state changes possible
Drill-down navigation: click any cell to expand its own 3×3 Mandala sub-plan
DuckDB via Rust backend — all data stays local, zero cloud dependency
AI coach: OpenAI/Anthropic prompt flow with current goal context
SSE-streamed Ask AI with goal context injection and GTD expert system prompt
06 — Writing
All posts ↗