Are LLaVA variants better than original?

  • Published: 6 Sep 2024

Comments • 6

  • @MrMoonsilver 3 months ago +2

    I would be very happy to see how you built the interface with Streamlit.

  • @AlexanderSuraphel 2 months ago +1

    Mark, I think Phi 3 got the answer right for "What is this *company* most famous for?", since the company in the image is Microsoft.

  • @mariotabali2603 3 months ago +1

    This is good

  • @AdandKidda 2 months ago

    Great comparison.
    I have used Gemini Pro Vision and am looking for a similar open-source solution.
    Can you please suggest something for:
    "extracting data as key-value pairs from documents (images and PDFs) like invoices, forms, and IDs"?
    A light model would help, so that it can run with less memory.
    Thanks in advance. :)

    • @learndatawithmark 2 months ago

      I think that to use a smaller model for a specific task, we might want to fine-tune something ourselves with a bunch of images and the expected output. I read that PaliGemma is supposed to be a good base model for that, but I haven't tried fine-tuning it myself: huggingface.co/google/paligemma-3b-mix-448
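
As a rough illustration of the "bunch of images and the expected output" idea in the reply above, here is a minimal sketch of how such training pairs might be laid out before any fine-tuning. It is not taken from the video or the reply: the file paths, field names, prompt text, and the `make_record` and `to_jsonl` helpers are all hypothetical, and a real fine-tuning library will likely expect its own record format.

```python
import json


def make_record(image_path, fields, prompt="extract the invoice fields as JSON"):
    """Pair one document image with the text the model should learn to produce.

    All names here are hypothetical; adapt them to your fine-tuning tool.
    """
    # Sort keys so the same fields always serialize to the same target string.
    target = json.dumps(fields, sort_keys=True, ensure_ascii=False)
    return {"image": image_path, "prompt": prompt, "target": target}


def to_jsonl(records):
    """Serialize records as JSON Lines: one training example per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


# Hypothetical labelled examples: image path plus the expected key-value output.
examples = [
    make_record("invoices/0001.png", {"invoice_number": "INV-1001", "total": "125.00"}),
    make_record("invoices/0002.png", {"invoice_number": "INV-1002", "total": "89.50"}),
]
jsonl = to_jsonl(examples)
```

Keeping the target as a JSON string means the model's output can be parsed straight back into key-value pairs at inference time, assuming the fine-tuned model learns to follow that format.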