Retrieval Augmented Generation in the Wild: Anton Troynikov

  • Published: Aug 22, 2024

Comments • 2

  • @Canna_Science_and_Technology 9 months ago +2

    FYI, direct retrieval fails with GPT-4 when using the new OpenAI Assistants API. GPT tokenizes text and creates its own vector embeddings based on its specific training data. New terms and sequences may not connect well to the pretrained knowledge in GPT's weight tensors.
    There was no semantic similarity between the new API terms and GPT's existing vector space. This is a fundamental issue with retrieval augmentation systems like RAG: external knowledge is not truly integrated into the model's learned weights. Adding more vector stores cannot solve this core problem.
    The solution is to have multiple learned "knowledge planes" with trained weight tensors for specific tasks that can be switched in. This is better than just retrieving separate vector representations.

  • @Jonathan-rm6kt 8 months ago +1

    Excellent presentation. I have found vanilla embeddings insufficient for “level 2” tasks, which require multiple pieces of context ranging from ultra-specific to rolled up across the entire document. If anyone can link research on how to embed temporal meaning within chronological text, I would love to take a look!