
An issue I've seen in several RAG implementations is the assumption that the target documents, however cleverly they're chunked, will make good search keys for incoming queries. Unless the incoming search text looks semantically like the documents you're searching over (which in general it doesn't), you'll get bad hits. On a recent project we saw a big improvement in retrieval relevance when we separated the search keys from the returned values (the chunked documents) and used an LM to generate appropriate keys, which were then embedded. "Appropriate" here means "sentences like what a user might type if they were expecting this chunk back".
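A minimal sketch of the indexing side in Python, assuming an OpenAI-style client; the model names, the prompt, and the key-store layout are illustrative assumptions, not the exact setup from the project:

    from openai import OpenAI

    client = OpenAI()

    def make_keys(chunk_text, n=3):
        # Illustrative prompt: ask the LM for questions a user might
        # type when they are expecting this chunk back.
        prompt = (
            f"Write {n} short questions a user might ask that this "
            f"passage answers, one per line:\n\n{chunk_text}"
        )
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # assumed model; swap in your own
            messages=[{"role": "user", "content": prompt}],
        )
        lines = resp.choices[0].message.content.splitlines()
        return [q.strip() for q in lines if q.strip()]

    def index_chunk(chunk_id, chunk_text, store):
        # Embed the generated keys, not the chunk itself; each key
        # maps back to the chunk that should be returned on a hit.
        keys = make_keys(chunk_text)
        embs = client.embeddings.create(
            model="text-embedding-3-small", input=keys
        )
        for key, item in zip(keys, embs.data):
            store.append(
                {"key": key, "embedding": item.embedding, "chunk_id": chunk_id}
            )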

Interesting! So you basically got an LM to rephrase the search phrases/keys into the style of the target documents, then used those in the RAG pipeline? Did you do an initial search first to narrow down the documents?


IIUC, they're doing a sort of Q&A generation for each document chunk: they ask an LLM to "play the user role and write a question that this chunk would answer". They then embed those questions, match live user queries against the questions first, and maybe re-rank on the document chunks retrieved.
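If that reading is right, the query-time side would look roughly like this sketch (cosine similarity over question embeddings in a store shaped like the one above; dedup because several questions can point at the same chunk; the re-rank step is left as a comment):

    import numpy as np

    def retrieve(query, store, client, top_k=5):
        # Embed the live query and match it against the generated
        # questions, not against the chunks themselves.
        q = np.asarray(
            client.embeddings.create(
                model="text-embedding-3-small", input=[query]
            ).data[0].embedding
        )
        q = q / np.linalg.norm(q)

        scored = []
        for item in store:
            k = np.asarray(item["embedding"])
            sim = float(q @ (k / np.linalg.norm(k)))
            scored.append((sim, item["chunk_id"]))
        scored.sort(key=lambda t: t[0], reverse=True)

        # Several generated questions can point at the same chunk,
        # so deduplicate before returning.
        seen, hits = set(), []
        for _, chunk_id in scored:
            if chunk_id not in seen:
                seen.add(chunk_id)
                hits.append(chunk_id)
            if len(hits) == top_k:
                break
        return hits  # optionally re-rank these chunks before answering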
