Local ChatGPT on MacBook Air M2 - Running the best Open 13B LLM with llama.cpp from scratch

  • Published: 27 Oct 2024

Comments • 14

  • @laobaGao-y7f
    @laobaGao-y7f 10 months ago +1

    Glad to see your video! I'm trying to use my social media chats to train my own 'digital twin', and I'm currently wondering whether 96GB on an M2 Max would be enough for my needs, since I want to run both the model training and the deployment locally. If this plan is feasible, I may also do some local training in the future on other, more sensitive data, instead of uploading it to GPTs. The 16GB of memory on the M2 Pro I'm currently using doesn't seem to support this idea very well.

    • @sanchogodinho
      @sanchogodinho 5 months ago +1

      I would suggest buying a Mac that's enough for your normal coding and everyday work, and running the AI training on cloud servers instead of on your device. You'll save a lot of money, since you'll probably rarely need to train it!
      Just ignore my comment if you really do need to train large AI models frequently. Otherwise, you can consider my suggestion...

  • @Panpaper
    @Panpaper 10 months ago +3

    I don't see a convert-pth-to-ggml.py file anywhere in the llama.cpp repository. Was it recently removed? I can't proceed at all; I'd appreciate any help.

    • @enricd
      @enricd  10 months ago

      Thanks for your question! Apparently they have recently changed this in the llama.cpp project, and now there's a more general script called convert.py that can handle different weight-file formats as input and convert any of them to ggml. You can check the details in the llama.cpp GitHub readme, but it should work by running python convert.py (I haven't tried it myself) - roughly like the sketch below.
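
      For example, something along these lines, following the README's pattern at the time (the paths and output file names are just examples and may differ between versions; I haven't run these exact commands):

          # convert the original PyTorch weights to ggml (adjust the path to your weights folder)
          python convert.py models/13B/
          # optionally quantize the converted model to 4 bits so it needs much less RAM
          ./quantize models/13B/ggml-model-f16.gguf models/13B/ggml-model-q4_0.gguf q4_0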

    • @manishraj-rz4lh
      @manishraj-rz4lh 8 months ago

      @@enricd So, what should the code look like?

  • @DocuFlow
    @DocuFlow 1 year ago +1

    Apologies if I missed it, but did the GPU get used, and if so, was shared memory useful? I'm wondering if I should get a Mac Mini with maxed-out RAM to run in GPU mode.

    • @enricd
      @enricd  1 year ago

      Hey, no worries! At the end of the video I showed the GPU monitor graph and the CPU one, and everything related to the LLM is running only on the CPU; the GPU is only used by other apps, like the screen recording. If you want to try GPU inference, see the sketch below.
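
      llama.cpp can also offload the model to Apple's GPU through Metal. Roughly like this, following the README at the time (flag spellings and build defaults may have changed since, and I haven't benchmarked it on the Air):

          # build with Metal support enabled
          LLAMA_METAL=1 make
          # -ngl (--n-gpu-layers) offloads layers to the GPU; any value > 0 enables Metal
          ./main -m models/13B/ggml-model-q4_0.gguf -p "Hello" -n 128 -ngl 1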

  • @human-pl7kx
    @human-pl7kx 1 year ago +2

    How much RAM does your MacBook have?

    • @enricd
      @enricd  1 year ago

      24GB, but it was barely using 8GB while running it, with some Chrome tabs open and the screen recording software running. That matches a rough back-of-envelope estimate (see below).
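
      As a sanity check (just a sketch, assuming the 4-bit q4_0 quantization, not an exact measurement):

          # ~13e9 params * ~0.5 bytes/param (4-bit weights) ≈ 6.5 GB for the weights alone,
          # plus the KV cache and runtime overhead, which lands around the observed 8-9 GB
          python3 -c "print(f'{13e9 * 0.5 / 1e9:.1f} GB')"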

    • @human-pl7kx
      @human-pl7kx 1 year ago +1

      @@enricd 13B model?

    • @enricd
      @enricd  1 year ago

      @@human-pl7kx Yes, you can check the end of this video, where I showed the Mac's Activity Monitor with the RAM at around 8-9GB: ruclips.net/video/T4mJcz7dRvE/видео.html

    • @human-pl7kx
      @human-pl7kx 1 year ago +1

      @@enricd I cannot run Llama 2 13B on a Mac with 8GB. Looks like I ran out of memory.

    • @enricd
      @enricd  1 year ago

      @@human-pl7kx Oh interesting... does it work with the 7B version? Do you have any other apps open using RAM apart from llama.cpp? You could try something like the sketch below.
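
      For example (the path and file name are just examples from my setup; a q4_0-quantized 7B is roughly 4GB of weights, which should be a much better fit for 8GB of shared memory):

          # run the 4-bit 7B model; ~4 GB of weights leaves some headroom on an 8 GB machine
          ./main -m models/7B/ggml-model-q4_0.gguf -p "Hello, how are you?" -n 128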