Local LightRAG: A GraphRAG Alternative but Fully Local with Ollama

  • Published: 5 Jan 2025

Comments •

  • @optimistic_dipak8632 · 1 month ago +1

    Great video!!! Kindly make more videos on LightRAG and all the latest cool technologies, please. You are my one-stop source for learning about new technologies. Thank you so much!!

  • @Othit · 1 month ago +1

    Wow, got it working, thank you so much. It took the better part of a day on my non-GPU laptop.
    Next step is to repeat this with some cloud-based GPU horsepower.

    • @moravskyvrabec · 1 month ago

      Hi @Othit, sorry, I got it running, too. You can ignore my question! :-)

  • @SurajPrasad-bf9qn · 2 months ago +1

    Thank you, your videos are helping me a lot; please keep uploading such videos.

  • @TCC_thecuriouscat · 2 months ago +8

    I was able to use both the latest Phi and Llama models with Ollama, and they work very smoothly with LightRAG. For large sets of files, I built a knowledge-graph-based conditional filter over LightRAG's GraphML files, which increased efficiency drastically; otherwise a hybrid query takes much longer. (A sketch of this kind of GraphML filtering appears after this thread.)

    • @huuhuynguyen3025 · 2 months ago

      how fast is it compared to normal RAG?

    • @LifeCasts · 2 months ago

      Your model selection itself is not great.

    • @TCC_thecuriouscat · 2 months ago

      @@LifeCasts As long as it works well for my use case with excellent results, I don't mind.

    • @TCC_thecuriouscat · 2 months ago +2

      @@huuhuynguyen3025 Normal vector RAG is much faster. LightRAG is not that fast, as it still creates both the graph and the embeddings. However, it is approximately 4-5x faster than GraphRAG on the documents I tested.

    • @TCC_thecuriouscat · 2 months ago +1

      @@LifeCasts If you are referring to the Qwen model, I tried it first as well, but it ran very slowly on my hardware, so I had to switch.
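
For readers who want to try something similar: LightRAG persists its knowledge graph as GraphML in the working directory, so it can be filtered offline with networkx. A minimal sketch, not @TCC_thecuriouscat's actual code: the file name follows LightRAG's default, while the attribute name and filter condition are assumptions.

```python
# Hedged sketch: conditional filtering over LightRAG's GraphML output.
import networkx as nx

# LightRAG writes its graph into the working directory by default.
G = nx.read_graphml("./rag_storage/graph_chunk_entity_relation.graphml")

# Keep only nodes whose entity type matches a condition (assumed attribute).
keep = [n for n, data in G.nodes(data=True)
        if "organization" in str(data.get("entity_type", "")).lower()]
sub = G.subgraph(keep).copy()  # induced subgraph keeps the matching edges

nx.write_graphml(sub, "./rag_storage/filtered.graphml")
print(f"Kept {sub.number_of_nodes()} of {G.number_of_nodes()} nodes")
```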

  • @richardkuhne5054 · 2 months ago +4

    Yes, we are interested; please add multi-modal and PDF processing. Also, use a cheap model with prompt caching for the chunking etc., and a smart model with a large context window for retrieval, to get accurate, vetted results; e.g., GPT-4o-mini for ingesting and Claude 3.5 Sonnet for retrieval. (A rough sketch of this split appears after this thread.)

    • @avinier325 · 2 months ago +1

      Can you elaborate on your process: which model did you use for each part, and how?
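
A rough sketch of the split @richardkuhne5054 describes, using LightRAG's bundled OpenAI helpers. Claude is not one of the bundled helpers, so gpt_4o_complete stands in for the "smart" retrieval model; the file name and question are placeholders.

```python
# Hedged sketch: index with a cheap model, query with a stronger one.
from lightrag import LightRAG, QueryParam
from lightrag.llm import gpt_4o_mini_complete, gpt_4o_complete

WORKDIR = "./rag_storage"

# Ingestion pass: the cheap model handles entity/relation extraction.
indexer = LightRAG(working_dir=WORKDIR, llm_model_func=gpt_4o_mini_complete)
with open("book.txt", encoding="utf-8") as f:
    indexer.insert(f.read())

# Query pass: reopen the same store, answered by the stronger model.
reader = LightRAG(working_dir=WORKDIR, llm_model_func=gpt_4o_complete)
print(reader.query("What are the key findings?",
                   param=QueryParam(mode="hybrid")))
```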

  • @sammcj2000 · 2 months ago +5

    I'd like to see a RAG system specifically built for working with large code bases. Most RAG examples are optimised for document retrieval and citation, but I think there's a lot of room for advanced code modernisation/rewriting augmented with RAG, simply to enable working with large code bases (e.g. >100k tokens).

  • @RodCoelho · 2 months ago +16

    Is there an institution that ranks RAG systems? For example, I would like to find out whether this or the multi-modal RAG from your recent video works better. Would you know?

    • @muffinmuncher · 2 months ago

      There is the MTEB (Massive Text Embedding Benchmark) for embedding models on Hugging Face, but "which would be better" depends on your application.

  • @BrunoEsteves-j9l · 2 months ago +4

    How can we make it return the "reference documents" that it uses for answering?
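
One possibility, assuming the QueryParam options shown in LightRAG's README: request the retrieved context instead of the generated answer and surface it as references. The question and the `rag` instance are placeholders.

```python
# Hedged sketch: only_need_context=True returns the retrieved
# entities/relations/source chunks rather than a generated answer.
from lightrag import QueryParam

context = rag.query(
    "What did the document conclude?",  # placeholder question
    param=QueryParam(mode="hybrid", only_need_context=True),
)
print(context)  # inspect what LightRAG drew on for its answer
```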

  • @davidtapang6917 · 2 months ago +1

    Definitely more LightRAG!

  • @henkhbit5748 · 25 days ago

    Thanks for the update of LightRAG with Ollama. I am curious what happens if you feed LightRAG a bunch of documents and how that impacts query/inference performance. In standard RAG we store the embeddings in a vector store; is this possible with LightRAG? It would be nice to see an example with more complex documents and the embeddings stored in a vector store, with an open-source LLM (for cost savings ;-) )

  • @angelfeliciano8794 · 2 months ago +3

    Thanks so much for your tutorials and demos. What if the data is related to products and I have already processed a txt with 200 products, and then the next day the price is updated for 5 products? Do I need to process the whole list again? Will the old price be remembered, or will it be replaced in the RAG?

  • @greenhost87 · 2 months ago +1

    I can't get "hybrid" mode to work with Ollama; it thinks for 10-15 minutes and prints something unreadable as the result... I tried the example from the repo without any modifications.

  • @MeinDeutschkurs · 2 months ago +1

    What about options={"num_ctx": 32000} in the function call? Is it not supported?
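
It is supported via llm_model_kwargs, following the repo's lightrag_ollama_demo.py; a minimal sketch where the model and embedder names are assumptions.

```python
# Hedged sketch: passing Ollama's num_ctx through LightRAG.
from lightrag import LightRAG
from lightrag.llm import ollama_model_complete, ollama_embedding
from lightrag.utils import EmbeddingFunc

rag = LightRAG(
    working_dir="./rag_storage",
    llm_model_func=ollama_model_complete,
    llm_model_name="qwen2",  # assumed model
    # Ollama options ride along in llm_model_kwargs:
    llm_model_kwargs={"options": {"num_ctx": 32000}},
    embedding_func=EmbeddingFunc(
        embedding_dim=768,
        max_token_size=8192,
        func=lambda texts: ollama_embedding(texts, embed_model="nomic-embed-text"),
    ),
)
```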

  • @SejalDatta-l9u · 1 month ago +1

    Everything seems to take extremely long, and I'm not sure why on an RTX 4080 with 12GB VRAM.
    When I typed `ollama ps` during the launch, I saw Qwen working but not the embedding model. Perhaps this is the problem. Does anyone have an idea why the embedding model wouldn't be running? Any tips?
    Thanks in advance.

  • @brucewayne2480 · 2 months ago +2

    What about an existing knowledge graph, in Neo4j for example? Can you enrich an existing graph?

  • @mahmoudsamir9537 · 2 months ago +1

    Thanks for that. I am confused about the query types: what are naive vs. local vs. global vs. hybrid?
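
Roughly, per the LightRAG paper and README: naive is plain vector retrieval over chunks, local retrieves around specific entities in the graph, global works at the broader relationship level, and hybrid combines local and global. A sketch comparing them, assuming an already-configured `rag` instance and a placeholder question:

```python
# Compare LightRAG's four query modes on the same question.
from lightrag import QueryParam

question = "What are the top themes in this document?"  # placeholder
for mode in ("naive", "local", "global", "hybrid"):
    print(f"--- {mode} ---")
    print(rag.query(question, param=QueryParam(mode=mode)))
```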

  • @morristai2840 · 1 month ago +2

    Is it possible to combine multiple articles and build one big knowledge graph?

    • @engineerprompt · 1 month ago +1

      Yes, that's possible. With the new updates, it supports a lot more than just text files now.
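
For reference, repeated inserts into the same working_dir accumulate into one graph, and insert() also accepts a list of strings. A minimal sketch with hypothetical file names, assuming a configured `rag` instance:

```python
# Merge several articles into one knowledge graph.
docs = []
for path in ("article1.txt", "article2.txt", "article3.txt"):  # hypothetical
    with open(path, encoding="utf-8") as f:
        docs.append(f.read())

rag.insert(docs)  # all three articles land in the same graph
```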

  • @thangdoan4831 · 1 month ago +1

    How can I give a chunked CSV or JSON file as input?

  • @benjaminbirdsey6874 · 2 months ago

    Can you summarize what they are using for the edges in their graph?
    Also, since the graph relations are stored in some generic text format (JSON?), can you generate the graph with one LLM and run it with another LLM? Advantages? Disadvantages?

  • @SurajPrasad-bf9qn · 2 months ago +1

    Can we use LightRAG for documents that contain images/tables and charts?

    • @vap0rtranz · 1 month ago +1

      Any tables/charts will need a multi-modal preprocessor to convert them into a format the LLMs understand, like JSON/Markdown. Docling just came out from IBM and does preprocessing of several doc types with internal structure. Unstructured is another option.
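
A hedged sketch of that preprocessing step with Docling; the PDF path is a placeholder and `rag` is an already-configured LightRAG instance.

```python
# Convert a PDF (tables included) to Markdown, then index it.
from docling.document_converter import DocumentConverter

converter = DocumentConverter()
result = converter.convert("report_with_tables.pdf")  # placeholder path
markdown = result.document.export_to_markdown()  # tables become Markdown

rag.insert(markdown)  # index the flattened text with LightRAG
```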

  • @SergioEanX · 16 days ago

    Can I integrate LightRAG with a vector database like Weaviate?

  • @SuperJg007 · 1 month ago

    I am getting this error when I run rag.insert:
    RuntimeError: This event loop is already running
    Any idea how to fix this?
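
A common workaround, assuming the error comes from calling insert() inside an already-running event loop (e.g. a Jupyter notebook); this is not an official LightRAG fix.

```python
# nest_asyncio patches the running loop so LightRAG's internal
# asyncio calls can nest inside Jupyter's event loop.
import nest_asyncio
nest_asyncio.apply()

rag.insert(text)  # retry; `rag` and `text` as configured earlier
```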

  • @PrathameshSaraf · 2 months ago

    How does this perform with ColBERT or ColPali?

  • @jayaverma634 · 2 months ago

    Can you please make a video on the roadmap for learning LLMs or generative AI?

  • @edisonyehosua5188 · 2 months ago

    When I run "python examples/lightrag_ollama_demo.py", it throws back a "500 internal privoxy error". I do use Shadowsocks; can anyone help me tackle this problem?

  • @JeyaramanR-h8r · 2 months ago

    Is there any way we can use Gemini models?

    • @engineerprompt · 2 months ago

      Not sure if they have an OpenAI-compatible API, but if not, I think they can be used via a LiteLLM proxy. Will explore.

    • @nghianguyentrantrong1762 · 2 months ago

      Ask Gemini to write the code for the Gemini connection )))

  • @rubencontesti221 · 2 months ago +2

    Sorry, what version of Qwen 2 are you using? 7B? Also, what minimum specs would you recommend for running it? Thank you.

    • @ShyamnathSankar-b5v · 2 months ago +1

      4 GB VRAM is the minimum for running a 7B model; better to use Llama 3.2 3B.

    • @rubencontesti221 · 2 months ago

      @@ShyamnathSankar-b5v I just tried llama3.2:3b and it does not seem to be working:
      ```
      ⠹ Processed 42 chunks, 0 entities(duplicated), 0 relations(duplicated)
      WARNING:lightrag:Didn't extract any entities, maybe your LLM is not working
      WARNING:lightrag:No new entities and relationships found
      INFO:lightrag:Writing graph with 0 nodes, 0 edges
      INFO:lightrag:Creating a new event loop in a sub-thread.
      ```

    • @ShyamnathSankar-b5v · 2 months ago

      @@rubencontesti221 Well, I have tried this by uploading a small amount of text, and it works fine.

    • @crane-d5d · 2 months ago

      @@rubencontesti221 I have the same problem. Have you solved it?

    • @vap0rtranz · 1 month ago

      @@crane-d5d 3B is very small. There have been some conversations suggesting that large, instruct-type models are needed to build the knowledge graph. One of the devs of NanoGraph, an alternative to LightRAG, suggested users switch to a larger Qwen2. I think it's no coincidence that PromptEngineering is also using it. I'm running a 14B Qwen2.5 with LightRAG and it is processing the knowledge graph.

  • @konstln5503 · 2 months ago

    Wondering, what is the use case for this in real life? For what purpose?
    A graph by itself, without any AI, is a representation of data that not everybody needs.

  • @McMartinLC · 2 months ago +1

    Could you do this with spreadsheets, or even just a CSV file, and then query the data, with graphs and all? Also, thank you for not hiding anything behind a .bat file on Patreon.

  • @johncult6948 · 1 month ago

    Brother, this is too hectic. It takes a lot of time to download and run these models. Oh god.