Glad to see your video! I'm trying to use my social media chats to train my own 'digital twin', and I'm currently wondering whether 96GB on an M2 Max is enough for my needs, because I want to run the model training and deployment locally. If this plan is feasible, I may also do some local model training in the future on other, more sensitive data, instead of uploading my data to GPTs. The 16GB of memory on the M2 Pro I'm currently using doesn't seem to support this idea very well.
I would prefer buying a Mac that's enough for your normal coding and stuff, and running the AI training on cloud servers instead of your own device. You'll save a lot of money, since you might hardly ever need to train it!
Just ignore my comment if you really need to train large AI models frequently. Otherwise, you can consider my suggestion...
I don't see a convert-pth-to-ggml.py file anywhere in the llama.cpp repository. Was it recently removed? I can't proceed at all, so I'd appreciate any help.
Thanks for your question. Apparently they have recently changed this in the llama.cpp project, and now they have a more general script called convert.py that can take different weight file formats as input and convert any of them to ggml. You can check the details in the llama.cpp GitHub readme, but it should work by running python convert.py (I haven't tried it myself, though).
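For what it's worth, the invocation would probably look something like this (the model path is a placeholder and the flag is my guess from the README; I haven't run it myself, so check the repo for the current usage):

```shell
# Hypothetical usage -- see the llama.cpp README for the exact, current flags.
# Point convert.py at the directory holding the original Llama weights;
# it writes out a single ggml file next to them:
python convert.py models/7B/ --outtype f16
```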
@@enricd So, what should the code look like?
Apologies if I missed it, but did the GPU get used, and if so was shared memory useful? I'm wondering if I should get a Mac Mini with max RAM to run in GPU mode.
Hey no worries! At the end of the video I showed the GPU monitor graph and the CPU one, and everything related to the LLM is running only on the CPU. The GPU is only used for other apps like the screen recording software and so on.
How much RAM does your MacBook have?
24GB, but it was barely using 8GB while running it, with some Chrome tabs open and the screen recording software.
@@enricd With the 13B model?
@@human-pl7kx yes, you can check at the end of this video where I showed the Mac's Activity Monitor with the RAM around 8-9GB: ruclips.net/video/T4mJcz7dRvE/видео.html
@@enricd I cannot run Llama 2 13B on a Mac with 8GB. Looks like I ran out of memory.
@@human-pl7kx Oh interesting... and does it work with the 7B version? Do you have any other apps open using RAM apart from llama.cpp?
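As a rough sanity check for this kind of question, you can estimate the memory a quantized model needs from parameter count times bits per weight. This is only a back-of-envelope sketch (real usage is higher, since llama.cpp also needs RAM for the KV cache, activations, and your other open apps):

```python
def est_model_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Rough model file size: params * bits / 8 bytes, expressed in GB.

    Ignores KV cache, activations, and runtime overhead, so treat the
    result as a lower bound on the RAM actually needed.
    """
    return n_params_billion * 1e9 * bits_per_weight / 8 / 1e9

# 13B at 4-bit quantization: ~6.5 GB just for the weights,
# which is tight on an 8GB Mac once macOS and other apps are counted.
print(round(est_model_gb(13, 4), 1))  # -> 6.5

# 7B at 4-bit: ~3.5 GB, which fits much more comfortably in 8GB.
print(round(est_model_gb(7, 4), 1))   # -> 3.5
```

This lines up with the thread: the 7B model leaving RAM usage around 8-9GB total on a 24GB machine is plausible, while 13B on an 8GB Mac runs out of memory.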