Building a Vision App with Ollama Structured Outputs

  • Published: Jan 3, 2025

Comments • 23

  • @davidmccauley7822
    4 days ago +19

    I would love to see a simple example of how to fine-tune a vision model with ollama.

  • @chizzlemo3094
    4 days ago +3

    OMG, this is exactly what I need. Thanks so much.

  • @suiteyousir
    2 days ago

    Thanks for these updates, quite difficult to keep up with all the new releases nowadays

  • @bigfootpegrande
    3 days ago

    Miles and AI? I'm all for it!

  • @gr8tbigtreehugger
    4 days ago +1

    Cool to see how you approached NER using an LLM. I've been using SpaCy.

    • @samwitteveenai
      3 days ago

      I normally use spaCy for anything at scale. You can use LLMs to make good datasets for custom entities and then use those to train the spaCy model.
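
The glue step in that workflow can be sketched as follows: converting a hypothetical LLM JSON extraction into spaCy's offset-based `(text, {"entities": [...]})` training format. The JSON field names (`span`, `label`) and the example text are illustrative assumptions, not from the video.

```python
import json

# Hypothetical JSON an LLM might return when asked to extract custom entities.
llm_json = (
    '{"text": "Miles Davis recorded Kind of Blue.", '
    '"entities": [{"span": "Miles Davis", "label": "PERSON"}]}'
)

def to_spacy_format(raw: str):
    """Convert LLM-extracted spans into spaCy's offset-based training tuple."""
    rec = json.loads(raw)
    text = rec["text"]
    ents = []
    for e in rec["entities"]:
        start = text.find(e["span"])
        if start == -1:
            continue  # skip spans the model hallucinated
        ents.append((start, start + len(e["span"]), e["label"]))
    return (text, {"entities": ents})

print(to_spacy_format(llm_json))
# ('Miles Davis recorded Kind of Blue.', {'entities': [(0, 11, 'PERSON')]})
```

Dropping spans that don't appear verbatim in the text is the key sanity check: it filters out hallucinated entities before they poison the training set.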

  • @PandoraBox1943
    1 day ago

    very useful

  • @sridharangopal
    2 days ago

    Great videos, Sam. Learnt so much from your videos. RE: the Llama vision model on Ollama, I have been trying to get it to work with both pictures and tools, but it looks like it can only do pictures and structured output, with no tool-calling support yet. Any idea on how to get around this limitation?
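
For context, the pictures-plus-structured-output request the commenter describes can be sketched against Ollama's `/api/chat` endpoint: a base64-encoded image in the message plus a JSON schema in the `format` field. The model tag, schema fields, and image path below are assumptions for illustration; tool calling would be a separate `tools` field that, per the comment, doesn't yet combine with images.

```python
import base64
import json
import urllib.request

# Assumed output schema: a caption plus a list of object names.
SCHEMA = {
    "type": "object",
    "properties": {
        "caption": {"type": "string"},
        "objects": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["caption", "objects"],
}

def build_payload(image_path: str) -> dict:
    """Build the /api/chat request body with a base64-encoded image."""
    with open(image_path, "rb") as f:
        img_b64 = base64.b64encode(f.read()).decode()
    return {
        "model": "llama3.2-vision",  # assumed model tag
        "messages": [{
            "role": "user",
            "content": "Describe this image.",
            "images": [img_b64],
        }],
        "format": SCHEMA,  # constrains the reply to match the schema
        "stream": False,
    }

if __name__ == "__main__":
    # Requires a local Ollama server on the default port.
    req = urllib.request.Request(
        "http://localhost:11434/api/chat",
        data=json.dumps(build_payload("photo.jpg")).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.loads(resp.read())["message"]["content"]
        print(json.loads(reply))  # parsed dict matching SCHEMA
```

One common workaround for the missing tools-plus-images combination is a two-step pipeline: first extract structured data from the image as above, then pass that JSON to a text-only model that does support tool calling.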

  • @loudmanCA
    4 days ago

    Really appreciate your channel! Could you make a video to help us better understand what specs are required for using LLMs locally?

  • @parnapratimmitra6533
    4 days ago

    Very informative video regarding vision-based models with structured outputs. If possible, could you also make a video on a simple LangChain or LangGraph app that uses Ollama's vision-based models to read and describe, as structured outputs, all the images in a document, say a PDF? Thanks in advance.

  • @justine_chang39
    2 days ago

    Do you know if this model would be good for getting the coordinates of objects in images? For example, I would like to get the coordinates of a dog in an image; the model might return a bounding box [[x1, y1], [x2, y2]].

    • @samwitteveenai
      1 day ago +1

      These models are probably not good enough for that at the moment, but certainly things like the new Gemini model can do that kind of task.
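
Worth noting: the request shape for such a query is easy to express with structured outputs; it is the model's visual grounding, not the schema, that is the limiting factor. A hypothetical schema for the [[x1, y1], [x2, y2]] format mentioned above might look like this (field names are illustrative assumptions):

```python
# A JSON schema can force the *shape* of the answer (a 2x2 array of numbers),
# but not the *accuracy* of the coordinates the model fills in.
POINT = {
    "type": "array",
    "items": {"type": "number"},
    "minItems": 2,
    "maxItems": 2,
}
BBOX_SCHEMA = {
    "type": "object",
    "properties": {
        "label": {"type": "string"},  # e.g. "dog"
        "box": {"type": "array", "items": POINT, "minItems": 2, "maxItems": 2},
    },
    "required": ["label", "box"],
}
```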

  • @austinlinco
    2 days ago +1

    I literally thought of this yesterday and was using a system prompt to force it to respond as a dictionary.
    Wtf is up with 2025 being perfect, and what's the catch?

  • @nufh
    4 days ago

    I have tried it; it depends on the model itself.

  • @pensiveintrovert4318
    4 days ago +1

    The amount of hacking you have to do to just get "ok" results says it all. Not production quality, and won't be any time soon.

    • @brando2818
      4 days ago +2

      Have you tried it with better models than were used here?

    • @adriangabriel3219
      4 days ago

      That's not to be expected with models of that size.

    • @pensiveintrovert4318
      4 days ago +1

      @brando2818 The whole point of using Ollama is to run open-source models on your own hardware. OpenAI, Anthropic, and Google already offer structured output.

  • @RohitSharma-uw2eh
    4 days ago +1

    Why is the Hindi audio track not available 😢

  • @surfaceoftheoesj
    23 hours ago

    very useful