Building RAG in Laravel: Four Ingestion Bugs That Silently Wreck Retrieval


1 hour ago · Tech · 0 comments

Every Laravel RAG tutorial builds the same ingestion pipeline (chunk, embed, store) and stops the moment the agent answers on screen. None of them check whether retrieval is any good. But retrieval quality is decided at ingestion, before the model runs once, and four decisions there fail with no error, no exception, no failed test:

1. Chunking that severs the answer mid-sentence, so answer@1 falls while source hit@1 still looks healthy.
2. An HNSW index built with vector_l2_ops while you query with cosine <=>. Postgres silently ignores the index and scans every row. Laravel 13's native whereVectorSimilarTo() hardcodes <=>, so it's easier to hit than ever. Shown with EXPLAIN.
3. The embedding dimension baked into the vector(1536) column type, so "shrink it to save storage" is a migration plus a full re-embed that quietly drops retrieval to 47%.
4. Ingesting and querying with different models, which turns every distance into noise.

Each bug is real code from a working repo, proven against an eval…
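The operator-class mismatch is the easiest of the four to check by hand. What follows is a minimal pgvector sketch, not the article's repo code: the table name chunks, its columns, and the index names are hypothetical, and the column uses 3 dimensions instead of 1536 only so the query literal stays readable. It shows an HNSW index built for L2 distance (which the cosine operator <=> cannot use), the matching cosine index, and the EXPLAIN call that reveals which one the planner picks.

```sql
-- Assumes the pgvector extension is available on the server.
CREATE EXTENSION IF NOT EXISTS vector;

-- Hypothetical table for illustration. The dimension is part of the column
-- type, so changing it later means a migration and a full re-embed.
CREATE TABLE chunks (
    id        bigserial PRIMARY KEY,
    content   text NOT NULL,
    embedding vector(3) NOT NULL  -- the article's column is vector(1536)
);

-- Mismatched: vector_l2_ops supports the L2 operator (<->) only,
-- so a query ordered by <=> cannot use this index.
CREATE INDEX chunks_embedding_l2_idx
    ON chunks USING hnsw (embedding vector_l2_ops);

-- Matched: vector_cosine_ops supports the cosine distance operator (<=>).
CREATE INDEX chunks_embedding_cosine_idx
    ON chunks USING hnsw (embedding vector_cosine_ops);

-- Inspect the plan for a cosine query. The ORDER BY must be the bare
-- distance expression, ascending, with a LIMIT, for an index to qualify.
EXPLAIN
SELECT id, content
FROM chunks
ORDER BY embedding <=> '[0.1, 0.2, 0.3]'
LIMIT 5;
```

To reproduce the silent failure, drop chunks_embedding_cosine_idx and run the EXPLAIN again: the query still returns rows, but no index can satisfy the ordering. On a tiny or empty table the planner may prefer a sequential scan even with the right index, so check against a realistically sized table (or temporarily SET enable_seqscan = off) before drawing conclusions.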
