Investigations, field notes, and occasional opinions.
A 13-month-old hashing default in LlamaIndex silently re-embeds byte-identical content. The bug fires on half the supported storage backends today; the other half is one upstream commit away from activating.
AI coding agents burn tokens reconstructing codebase structure on every session: grep, read, grep again, piece together what a symbol graph already knows. I built CodeGraph, a local MCP server that parses a repo with tree-sitter and exposes a pre-computed call graph to Claude Code through six tools. I use it daily to keep Claude Code token bills sane. A headless benchmark on a 484-file FastAPI stack measured a 59% drop in tokens, a 60% drop in turns, and 82 seconds less wall time per investigation, with file-level recall held at 100%.
I built ChatGPT LightSession to fix a slowdown in my own long threads. Sixty thousand people have installed it since, and the number keeps climbing. The extension trims the conversation JSON on the way in, keeping long sessions responsive without touching a server.