Hopefully we can get a better RAG out of it. Currently people do incredibly primitive stuff like splitting text into fixed-size chunks and dumping them into a vector DB.
An actually useful RAG would convert the text into Q&A pairs and use the questions' embeddings as the index. A large context window can take advantage of in-context learning to produce better Q&A pairs.
A lot of people doing RAG already do this. I do it with my product: we process each page, create a list of potential questions that the page would answer, and then embed those.
We also embed the actual text, though, because I found that only doing the questions resulted in inferior performance.
So in this case, what your workflow might look like is:
1. Get text from page/section/chunk
2. Generate possible questions related to the page/section/chunk
3. Generate an embedding using { each possible question + page/section/chunk }
4. Embed the incoming question and match it against the { question + source } embeddings
Is this roughly it? How many questions do you generate? Do you save a separate embedding for each question? Or just stuff all of the questions back with the page/section/chunk?
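In code terms, I'm picturing something like this minimal sketch of steps 1–4, assuming an OpenAI-style client. The model names, prompt, and in-memory index are illustrative stand-ins, not a description of your actual stack:

```python
from openai import OpenAI

client = OpenAI()

def generate_questions(chunk: str, n: int = 5) -> list[str]:
    """Step 2: ask the model for questions this chunk could answer."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f"List {n} questions, one per line, that the "
                       f"following text answers:\n\n{chunk}",
        }],
    )
    lines = resp.choices[0].message.content.splitlines()
    return [q.strip() for q in lines if q.strip()]

def embed(text: str) -> list[float]:
    return client.embeddings.create(
        model="text-embedding-3-small", input=text,
    ).data[0].embedding

def build_index(chunks: list[str]) -> list[dict]:
    """Step 3: one embedding per chunk over { questions + chunk text }."""
    index = []  # stand-in for a real vector DB
    for chunk in chunks:
        questions = generate_questions(chunk)
        index.append({
            "vector": embed("\n".join(questions) + "\n\n" + chunk),
            "chunk": chunk,
        })
    return index

# Step 4: embed the incoming question the same way and run a
# nearest-neighbour (cosine similarity) search over the stored vectors.
```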
Right now I just throw the different questions together in a single embedding for a given chunk, with the idea that there’s enough dimensionality to capture them all. But I haven’t tested embedding each question, matching on that vector, and then returning the corresponding chunk. That seems like it’d be worth testing out.
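If I do test it, I imagine it would look roughly like this (reusing the hypothetical generate_questions/embed helpers from the sketch above), with each question getting its own vector that points back to its source chunk:

```python
def build_per_question_index(chunks: list[str]) -> list[dict]:
    """One vector per generated question, each mapped back to its chunk."""
    index = []
    for chunk in chunks:
        for q in generate_questions(chunk):
            index.append({"vector": embed(q), "question": q, "chunk": chunk})
    return index

# At query time the nearest question vector wins, but you return its chunk,
# de-duplicating when several questions from the same chunk all match.
```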