Running Multimodal Models with KoboldCPP

  • Published: 21 Aug 2024
  • In this video we quickly go over how to load a multimodal model into the fantastic KoboldCPP application. While the models do not work quite as well as they do with llama.cpp directly from the terminal, you will see that they are a lot of fun!
    These LLaVA-based models are important because they help us extract details that we can then feed back into other AI models to generate new images or restore existing ones. With KoboldCPP, however, we'll find that the main application lies in narrative and storytelling.
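As a concrete starting point, here is a minimal sketch of launching KoboldCPP with a multimodal model from Python. The file paths are hypothetical, and while `--model`, `--mmproj`, and `--port` are flags commonly seen in KoboldCPP usage, check `koboldcpp --help` to confirm what your build supports.

```python
# Minimal sketch: launching KoboldCPP with a LLaVA-style multimodal model.
# Assumptions: koboldcpp is installed and on PATH; the flag names match
# your build; the model paths below are hypothetical placeholders.
import subprocess

MODEL = "models/llava-v1.5-13b.Q4_K_M.gguf"    # hypothetical text model path
MMPROJ = "models/mmproj-llava-13b-f16.gguf"    # hypothetical vision projector path

subprocess.run([
    "koboldcpp",
    "--model", MODEL,      # the base GGUF language model
    "--mmproj", MMPROJ,    # the multimodal projector that adds vision support
    "--port", "5001",      # KoboldCPP's usual default port
])
```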

Comments • 1

  • @Henk717 • 1 month ago

    Thanks for checking it out! Got a few tips for you based on the video:
    You mention it's not giving you the same results as when using it from the command line; I assume this may be related to the prompt template. In the settings at the bottom right (we are currently overhauling the settings screen, so it may soon be in a different place that's not yet decided) there is the prompt format. If you're using, for example, ChatML in llama.cpp but Alpaca here, that would result in different kinds of verbosity depending on the model you're using.
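To make the template point concrete, here is a small sketch showing how the same request looks under ChatML versus Alpaca formatting; a model fine-tuned on one format will often respond with different verbosity when wrapped in the other. The template strings below are the widely published formats, not anything specific to KoboldCPP.

```python
# Sketch: the same user prompt wrapped in two common instruct templates.
# A model tuned on one format may ramble or truncate under the other.

def chatml(prompt: str) -> str:
    # ChatML, a common choice in llama.cpp setups
    return (
        "<|im_start|>user\n"
        f"{prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

def alpaca(prompt: str) -> str:
    # Alpaca, one of the formats selectable in KoboldCPP's settings
    return (
        "### Instruction:\n"
        f"{prompt}\n\n"
        "### Response:\n"
    )

prompt = "Describe this image in detail."
print(chatml(prompt))
print(alpaca(prompt))
```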
    Second tip: you can actually combine the vision part with any tune that's using the same base model. I got the Llama 13B LLaVA adapter paired with Tiefighter, for example. Works pretty well.
    And lastly, in the Add Img customization settings there is an option to allow storing images in higher resolution, which can also help the image recognition.
    Of course, this is also fully compatible with the OpenAI emulation, for those needing LLaVA in an alternative UI that has image recognition support over that API.
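For anyone wanting to try the OpenAI emulation mentioned above, here is a hedged sketch of an image request against a locally running KoboldCPP instance. The port and endpoint path follow KoboldCPP's OpenAI-compatible API as commonly documented, but the exact URL, model name, and payload fields are assumptions to verify against your version.

```python
# Sketch: sending an image to KoboldCPP's OpenAI-compatible endpoint.
# Assumptions: KoboldCPP is running locally on port 5001 with a
# vision-capable model loaded; field names follow the OpenAI chat format.
import base64
import requests

with open("photo.jpg", "rb") as f:  # hypothetical local image
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "model": "koboldcpp",  # placeholder; KoboldCPP serves whatever is loaded
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
}

resp = requests.post("http://localhost:5001/v1/chat/completions", json=payload)
print(resp.json()["choices"][0]["message"]["content"])
```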