Part 1
In Good Character: Designing an Ingestion Pipeline for Hostile Tabletop Rules
June 10, 2026
Tabletop rulebooks are hostile to naive RAG: scanned PDFs, custom symbols, dense tables, and no reliable text layer. I used that mix as a validation corpus and built ingest for Structural Truth—heading paths, icon meanings in text, and book page numbers—by pivoting from unpdf and Document AI to Claude Haiku vision parse. Per-page and book-level caching makes re-chunking cheap; citation integrity belongs at the front of the pipeline.