Pulling Rank: Using Fused Retrieval to Bridge the Alias Gap
June 12, 2026
Engineering the Rules Oracle
Modern tabletop games rely on massive, highly fragmented ecosystems of rulebooks, supplements, and constantly updating FAQs. When an obscure rule interaction or edge case arises mid-game, it takes players out of the fun. I am building the Rules Oracle to solve this: a hosted Q&A engine that provides cited, page-referenced answers to complex rules questions. This series covers what I learned while engineering the first version of the Rules Oracle.
Access is currently invite-only during the beta period.
- Part 1: In Good Character: Designing an Ingestion Pipeline for Hostile Tabletop Rules
- Part 2: Pulling Rank: Using Fused Retrieval to Bridge the Alias Gap ← you are here
- Part 3: Fielding Questions: Using JSON Constraints to Force Grounded Answers
- Part 4: Preventing RAG Regressions: Eval Harnesses and Production Gates
Part 1 gave me structurally honest chunks. Heading paths, icons in text, book page numbers. That fixed layout lies. It did not fix vocabulary lies.
Some of my first test questions came from a Discord server where people play the game. I scrolled back through old posts and pulled real questions players had already argued about. One of them:
Is there a specific base size for a Warlord on horse?
The answer lives in the Bases section. That section defines base sizes for Heroes on horses. It never says "Warlord." The Heroes section defines that a Warlord is a type of Hero. No single chunk contains both "Warlord" and "base size."
Single-lane semantic search kept returning Warlord stat blocks, special rules, lore. The Bases chunk never made the cut. That is a vocabulary problem, not a parsing problem. Let's call it the Alias Gap: the book defines a rule at one level of its taxonomy (Hero), the question names a specific type (Warlord), and one embedding of the question will not reliably connect them.
Why Vibes Are Not a Search Strategy
Dense embeddings cluster concepts that sound related. That helps until a rule lives under a parent category and the question names a subtype that section never spells out. Two failure modes showed up early.
The vocabulary mismatch. Warlord is not slang for Hero. Both words are in the rulebook. A Warlord is defined as a type of Hero. The Bases section just says Hero. I could fine-tune a custom embedding model on this ruleset. That path keeps charging: a new tuning run and a full re-embed for every FAQ release, plus a self-hosted stack I would have to operate instead of calling a hosted API. I was not looking for recurring costs when architecture could answer the question more cheaply. Fine-tuning is paying rent on your vectors every FAQ season. Multi-lane retrieval is fixing the call numbers once.
The supersession mismatch. Tabletop games ship FAQs and errata that replace old rules with new text. Wargames, board games, RPGs: same pattern. Replacement language often embeds far from the original. In direct testing I saw pairs of original rule and FAQ replacement score around 0.59 cosine similarity: about as close as unrelated paragraphs from different books. Surfacing the zombie rule from the old edition is worse than returning nothing. That problem wants a different tool (I come back to it below).
The Fix: Two Embeddings for One Question
I did not start with a lookup table. I started with a simpler question: if I already know a Warlord is a Hero, what would I embed to find the Bases section?
That became two dense lanes plus lexical search, fused at the end.
Expanded dense keeps the player's words and adds a nudge toward the category:
Is there a specific base size for a Warlord on horse? [warlord (hero)]
The vector stays anchored to "Warlord." This lane is good at chunks that mention Warlord directly or state that a Warlord is a Hero.
Canonical dense replaces the alias with the book's category term:
Is there a specific base size for a hero on horse?
This lane reaches chunks written about Heroes, including Bases, which never mentions Warlord at all. That was the chunk single-lane retrieval kept missing.
Lexical runs Postgres full-text search on the original question. Embeddings flatten rare tokens into neighborhood averages. Lexical search still cares that the user typed a faction name or ability title exactly as printed.
Without the canonical lane, Bases scores too low because "Warlord" means nothing in that text. Without the expanded lane, I can miss the definitional bridge chunk. I run all three in parallel, then merge the ranked lists.
Each search lane skips chunks marked is_superseded = true. That flag goes on the old rule text in the parent book. The FAQ or errata document is ingested separately, with the replacement wording, and stays searchable. I am not throwing away the update. I am hiding the zombie.
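For the lexical lane, that filter could look something like the query below. This is a sketch only: the chunks table, the content column, the 'english' text search configuration, and the lexicalLaneQuery helper are all assumptions made for illustration. Only the is_superseded flag comes from the pipeline described here.

```typescript
// Hypothetical schema: a `chunks` table with `id`, `content`, and `is_superseded`.
// Only the `is_superseded` flag is taken from the article; the rest is assumed.
const LEXICAL_LANE_SQL = `
  SELECT id,
         ts_rank(to_tsvector('english', content), plainto_tsquery('english', $1)) AS rank
  FROM chunks
  WHERE is_superseded = false
    AND to_tsvector('english', content) @@ plainto_tsquery('english', $1)
  ORDER BY rank DESC
  LIMIT $2
`

// Builds a parameterized query object, so the player's question travels as a
// bound value and is never spliced into the SQL text.
function lexicalLaneQuery(
  question: string,
  limit = 20, // assumed per-lane candidate count
): { text: string; values: [string, number] } {
  return { text: LEXICAL_LANE_SQL, values: [question, limit] }
}
```

The dense lanes would carry the same is_superseded = false predicate. One caveat with this sketch: plainto_tsquery requires every surviving term to match, so a production lane would probably need a looser query for long questions.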
After fusion I boost core rulebook hits (expansion PDFs outnumber core and bury it otherwise), then take the top ten into context.
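A minimal sketch of that last step, under stated assumptions: the boost here is a multiplier on the fused score, and the 1.5 factor, the isCore flag, and the FusedHit shape are hypothetical. The text above only says that core hits get boosted and that ten chunks go into context.

```typescript
// Hypothetical shape for a fused hit; `isCore` marks chunks from the core rulebook.
interface FusedHit {
  chunkId: string
  score: number
  isCore: boolean
}

// Multiplies core rulebook scores by `coreBoost` (an assumed value), re-sorts
// by score descending, and keeps the top `limit` hits for the context window.
function boostCoreAndTake(hits: FusedHit[], coreBoost = 1.5, limit = 10): FusedHit[] {
  return hits
    .map((hit) => (hit.isCore ? { ...hit, score: hit.score * coreBoost } : hit))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
}
```

A multiplier keeps the boost proportional to how well a chunk already ranked, so a core chunk that barely made any list does not leapfrog a strong expansion hit.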
Fusing with Reciprocal Rank Fusion
Three lists come back. I merge them with Reciprocal Rank Fusion (RRF): chunks that rank highly in multiple lists accumulate score. Consensus beats a single lane's confident misfire.
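// Merge ranked lists with Reciprocal Rank Fusion.
// ChunkRow is the row type each lane returns; only its id is used here.
function rrf(lists: ChunkRow[][], k = 60): { chunk: ChunkRow; score: number }[] {
  const scores = new Map<string, number>()
  const chunkById = new Map<string, ChunkRow>()
  for (const list of lists) {
    list.forEach((chunk, rank) => {
      // rank is the zero-based position in this lane, so a top hit adds 1 / (k + 1)
      scores.set(chunk.id, (scores.get(chunk.id) ?? 0) + 1 / (k + rank + 1))
      chunkById.set(chunk.id, chunk)
    })
  }
  // Highest accumulated score first
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id, score]) => ({ chunk: chunkById.get(id)!, score }))
}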
k = 60 dampens rank so first place in one lane cannot automatically crush a chunk that placed second or third in all three. For the Warlord question, RRF is what lets Bases and Heroes show up in the same context window.
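To put numbers on that claim, here is a small worked example. It uses the same 1 / (k + rank + 1) term with zero-based ranks; the rrfScore helper is a hypothetical name introduced only for this arithmetic.

```typescript
// Sums the RRF contribution for one chunk, given its zero-based rank in each
// list where it appears.
function rrfScore(ranks: number[], k = 60): number {
  return ranks.reduce((sum, rank) => sum + 1 / (k + rank + 1), 0)
}

const firstInOneLane = rrfScore([0]) // 1/61, about 0.0164
const secondInAllThree = rrfScore([1, 1, 1]) // 3/62, about 0.0484
const thirdInAllThree = rrfScore([2, 2, 2]) // 3/63, about 0.0476
```

A chunk that places third in every lane still scores almost three times what a single first place earns.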
Generalizing with an Alias Table
Hand-authoring [warlord (hero)] strings does not scale across game systems. Once the three-lane shape worked, I automated the substitutions.
After a rulebook is ingested, an extraction pass runs over its chunks (pnpm extract-aliases). An LLM proposes alias relationships (warlord → hero, horse → mount). Each row needs an evidence quote that appears verbatim in a source chunk before I insert it into concept_aliases. That table is part of the ingest pipeline, not something I maintain by hand at query time.
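The evidence check can be sketched as a plain substring test. The ProposedAlias and SourceChunk shapes and the keepGroundedAliases name are assumptions for illustration, not the actual extract-aliases code, and the sample sentence in the usage note is invented rather than quoted from any rulebook.

```typescript
// Hypothetical shapes for an LLM-proposed alias row and an ingested chunk.
interface ProposedAlias {
  alias: string
  canonical: string
  evidenceQuote: string
}

interface SourceChunk {
  id: string
  text: string
}

// Keeps only proposals whose evidence quote appears verbatim in at least one
// source chunk. Empty quotes are rejected because they would match everything.
function keepGroundedAliases(proposals: ProposedAlias[], chunks: SourceChunk[]): ProposedAlias[] {
  return proposals.filter((proposal) => {
    const quote = proposal.evidenceQuote.trim()
    return quote.length > 0 && chunks.some((chunk) => chunk.text.includes(quote))
  })
}
```

Given a chunk reading "A Warlord is a type of Hero.", a proposal quoting "Warlord is a type of Hero" survives, while one whose quote the model paraphrased or invented is dropped before it reaches concept_aliases.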
At query time, alias matches rewrite the embedding strings only. The player's question stays unchanged in the answer prompt. When no alias matches, expanded and canonical collapse to the same text and I pay for one embedding instead of two.
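A rough sketch of that rewrite step follows. The Alias and LaneQueries shapes, the buildLaneQueries name, whole-word case-insensitive matching, and the comma-separated hint format for multiple matches are assumptions; the single-alias output matches the two example strings shown earlier.

```typescript
// Hypothetical row shape for concept_aliases as used at query time.
interface Alias {
  alias: string
  canonical: string
}

interface LaneQueries {
  expanded: string // player's words plus a bracketed category hint
  canonical: string // alias swapped for the book's category term
  lexical: string // original question, untouched
}

// Escapes regex metacharacters so alias text is matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")
}

function buildLaneQueries(question: string, aliases: Alias[]): LaneQueries {
  const hints: string[] = []
  let canonical = question
  for (const { alias, canonical: term } of aliases) {
    const source = `\\b${escapeRegExp(alias)}\\b`
    // Skip aliases that do not occur as a whole word in the question.
    if (!new RegExp(source, "i").test(question)) continue
    hints.push(`${alias} (${term})`)
    canonical = canonical.replace(new RegExp(source, "gi"), term)
  }
  // With no matches, expanded and canonical are both the original question,
  // so the caller can embed once and reuse the vector.
  const expanded = hints.length > 0 ? `${question} [${hints.join(", ")}]` : question
  return { expanded, canonical, lexical: question }
}
```

A caller can compare expanded and canonical for equality to decide whether the second embedding call is needed at all.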
The table did not solve the Warlord case. The canonical lane did. The table makes the next alias gap cheaper to close without redesigning retrieval.
Supersession at Ingest Time
FAQ replacements are the second failure mode from above. My bias is to fix problems at ingest when I can. Ingest runs once per document. Queries run all night at the table. Supersession fits that shape: resolve which parent chunk is dead during FAQ ingest, store the replacement in the FAQ document itself, then filter zombies at query time with a boolean flag.
The matching logic stays game-agnostic: section names, text search, page references, vector fallback, LLM confirm. Nothing hard-coded to one game's unit list. Any tabletop corpus with errata should be able to ride the same path.
When an FAQ or errata document lands, I try to find the parent rule it replaces, in order:
- Section name matched against the parent chunk's heading path.
- Ability name searched in parent chunk text (for corrections buried inside a larger section).
- Page reference parsed from the FAQ heading or body, matched against parent page ranges.
- Vector similarity at threshold 0.60, only as a last resort.
Candidates merge and dedupe. A small LLM call confirms each pair. On confirmation I set is_superseded = true on the parent chunk only. The FAQ chunk that carries the new wording stays in the index. At query time every lane filters out superseded rows, so retrieval surfaces the replacement text instead of the zombie rule.
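The candidate-gathering part of that flow might look like the sketch below. Every type and name in it is hypothetical, matching is simplified to case-insensitive string comparison, and "last resort" is read here as "only when the first three strategies find nothing", which is one interpretation. The LLM confirmation and the is_superseded update are left out.

```typescript
// Hypothetical shapes; field names are assumptions for illustration.
interface ParentChunk {
  id: string
  headingPath: string[]
  text: string
  pageStart: number
  pageEnd: number
}

interface FaqEntry {
  sectionName?: string
  abilityName?: string
  pageRef?: number
}

// Injected so the sketch stays free of any particular vector store.
type VectorSearch = (threshold: number) => ParentChunk[]

function findSupersessionCandidates(
  faq: FaqEntry,
  parents: ParentChunk[],
  vectorSearch: VectorSearch,
): ParentChunk[] {
  const found = new Map<string, ParentChunk>() // keyed by id, so merges dedupe
  const add = (chunks: ParentChunk[]) => chunks.forEach((c) => found.set(c.id, c))
  const lower = (s: string) => s.toLowerCase()

  if (faq.sectionName) {
    const name = lower(faq.sectionName)
    add(parents.filter((p) => p.headingPath.some((h) => lower(h) === name)))
  }
  if (faq.abilityName) {
    const name = lower(faq.abilityName)
    add(parents.filter((p) => lower(p.text).includes(name)))
  }
  if (faq.pageRef !== undefined) {
    const page = faq.pageRef
    add(parents.filter((p) => p.pageStart <= page && page <= p.pageEnd))
  }
  // Vector similarity only when nothing structural matched.
  if (found.size === 0) add(vectorSearch(0.6))
  return [...found.values()]
}
```

Each returned candidate would then go through the confirmation call before any flag is flipped on a parent chunk.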
Conclusion
By the time retrieval finishes, I have ten chunks that survived multi-lane fusion and core boosting, with outdated parent rules filtered out but FAQ replacements still in the pool. Structural Truth from Part 1 still lives inside each chunk. This layer makes sure the right chunks arrive.
Getting context into the model is only half the work. The other half is stopping it from paraphrasing those chunks into fiction.
In Part 3, Fielding Questions, I cover the JSON field ordering that forces the model to quote before it answers.