- Videos: 4
- Views: 26,447
Connor
USA
Joined Apr 4, 2015
Vinyl & Spotify with Raspberry Pi To Studio Speakers
Using a Raspberry Pi as an audio mixer to combine audio sources while having low latency and without losing quality.
Helpful Resources Repo: github.com/ConnorsApps/pipewire-video-resources
Golang pipewire monitor program: github.com/ConnorsApps/pipewire-monitor-go
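The "audio mixer" setup described above hinges on being able to inspect and monitor the PipeWire graph. As a rough illustration (a standalone sketch, not the API of the linked pipewire-monitor-go repo), the Go program below shells out to PipeWire's pw-dump tool, which prints the current graph as JSON, and lists the audio nodes; property names like node.name and media.class are assumptions based on standard PipeWire metadata.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os/exec"
)

func main() {
	// pw-dump prints the current PipeWire graph as a JSON array of objects.
	raw, err := exec.Command("pw-dump").Output()
	if err != nil {
		panic(err)
	}

	// Decode only the fields we care about: object type and node properties.
	var objects []struct {
		Type string `json:"type"`
		Info struct {
			Props map[string]any `json:"props"`
		} `json:"info"`
	}
	if err := json.Unmarshal(raw, &objects); err != nil {
		panic(err)
	}

	// Print each node's name and media class (e.g. Audio/Sink, Audio/Source).
	for _, o := range objects {
		if o.Type == "PipeWire:Interface:Node" {
			fmt.Println(o.Info.Props["node.name"], "-", o.Info.Props["media.class"])
		}
	}
}
```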
Views: 893
Videos
Uncensored self-hosted LLM | PowerEdge R630 with Nvidia Tesla P4
23K views · 6 months ago
Ollama: ollama.com/
Ollama UI: github.com/open-webui/open-webui
OS: Ubuntu 24.04
Nvidia with Kubernetes: github.com/NVIDIA/k8s-device-plugin
Benchmark Program: github.com/ConnorsApps/ollama-benchmarks
VM in k8s: github.com/linuxserver/docker-webtop/
The k8s manifest I used: gist.github.com/ConnorsApps/362b54f92392d93dd5ea6c92df2d52b1
Featured video: "How to install a Graphics Card in a Rack Serv...
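For anyone poking at the Ollama link above: the basic interaction with a local Ollama server is a single HTTP call. Here's a minimal Go sketch against the documented /api/generate endpoint on Ollama's default port 11434 (the model name is just an example; pull whatever fits your card):

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// Non-streaming request: Ollama returns a single JSON object when done.
	reqBody, _ := json.Marshal(map[string]any{
		"model":  "llama3.1:8b",
		"prompt": "Why is the sky blue?",
		"stream": false,
	})
	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(reqBody))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Only the generated text is decoded here; the response carries more fields.
	var out struct {
		Response string `json:"response"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	fmt.Println(out.Response)
}
```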
My Favorite 10 Logic Presets 2022
70 views · 2 years ago
1. Hifi Pop Drums: love it for a pop/indie sound. Pull the top end down with EQ, add a compressor ("VCA Smashed" uses the vintage compressor).
2. Liverpool Bass, with the distortion up
3. Tin Can Mallets
4. Trap Bass
5. Sweet Overdrive Kalimba
6. Roland CR-78
7. 808 Flex
8. Autumn Leaves
9. Liquid Synth Keys
10. 70s Analog Lead
Human Body Rhythm Effects - ew
My 7 Favorite Analog Lab 5 Presets
2.9K views · 3 years ago
Check out Analog Lab www.arturia.com/products/analog-classics/analoglab-v/overview
Do you recommend this GPU for a 2B model?
@@alfredvarela2119 The rule of thumb I went by is to look at the model's storage size and make sure it can fit entirely in GPU memory. So as long as it's under 8GB you're good.
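To apply that rule of thumb without guessing, you can ask a running Ollama instance for the size of each pulled model and compare it against the card's 8GB. A hedged Go sketch against Ollama's /api/tags endpoint; the assumption here is that the size field reports the model's on-disk size in bytes:

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// /api/tags lists the locally pulled models along with their sizes.
	resp, err := http.Get("http://localhost:11434/api/tags")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var tags struct {
		Models []struct {
			Name string `json:"name"`
			Size int64  `json:"size"`
		} `json:"models"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&tags); err != nil {
		panic(err)
	}

	const vram = 8 << 30 // Tesla P4: 8 GiB of VRAM
	for _, m := range tags.Models {
		verdict := "fits"
		if m.Size > vram {
			verdict = "too big for VRAM"
		}
		fmt.Printf("%-30s %5.1f GiB  %s\n", m.Name, float64(m.Size)/(1<<30), verdict)
	}
}
```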
3:36 This is a bad habit: you should not plug or unplug hardware while it is powered on.
@@user-qv1no I believe I turned it off with the power button; I just didn't completely unplug everything.
Have you looked at Tesla T4?
@@thedevhow The price is mainly what scared me off for now; I'd need a better use for my server's GPU than what I'm currently doing.
Drivers and OS? I couldn’t get that from the video
Good point, I'll update the description. I run Ubuntu 24.04. I'm using github.com/NVIDIA/k8s-device-plugin for working with NVIDIA GPUs in a Kubernetes cluster. That page also links guides for installing OS-specific drivers.
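For context, once that device plugin is running, a pod claims a GPU through the nvidia.com/gpu resource name. The same request the linked manifest would express in YAML can be sketched with the Kubernetes Go types (the container name and image here are illustrative, not taken from the gist):

```go
package main

import (
	"encoding/json"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

func main() {
	spec := corev1.PodSpec{
		Containers: []corev1.Container{{
			Name:  "ollama",
			Image: "ollama/ollama",
			Resources: corev1.ResourceRequirements{
				Limits: corev1.ResourceList{
					// The NVIDIA device plugin advertises GPUs under this resource name.
					"nvidia.com/gpu": resource.MustParse("1"),
				},
			},
		}},
	}
	// Dump the spec so you can compare it against a hand-written manifest.
	out, _ := json.MarshalIndent(spec, "", "  ")
	fmt.Println(string(out))
}
```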
Just bought myself an R630 (E5-2690 v4, 128GB) to self-host a gaming server and other things. Is the T4 really the best we can do without modifications? Ugh, if so, I'm so mad I didn't go with the XD version so I could get a better GPU for inference and transcoding.
The one minute mark threw me for a loop... Then I just laughed really hard. Thanks.
Interesting setup on an Intel Xeon E5-2640. I'm trying the same with my AMD Ryzen 5600GT, but still haven't decided if I should get the M40 with 24 GB of RAM, or the "newer" Tesla P4.
@@jco997 The M40 is quite a bit longer, and I would have gotten the P40 or M40 if I could. What server do you have?
@@Connorsapps My comment was deleted for posting a link, but it's a custom build with an AMD Ryzen 5600GT. Yours, I think, must be a Xeon E5-2640 v4, considering you have a core count of 20.
I normally use CPU benchmarks from PassMark, since they give me a ballpark figure for how much performance I could expect from any CPU model.
llama.cpp works fine on CPU; it's slower than on GPU, but still usable.
Good video. I run a similar setup on an R720, but I'm using an RTX 2000 Ada Gen (16GB). No external power needed, and it uses a blower-style fan so there's no real need for an "external" cooler solution, but they run about $500-$600 on eBay. I got mine for $550. I'm on the hunt for another one. It's basically an Nvidia 3060 with a couple hundred more tensor cores and more VRAM, so not too shabby. I'm using a Proxmox container for the AI gen stuff. My model is a fine-tuned version of Dolphin-Mistral 2.6 Experimental with a pretty chonky context window.
nice shots of your record player
What CPU or CPUs do you have? I'm looking at a GPU for my R7515 for Ollama.
@@HaydonRyan 2x Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz, 8 cores each
Could you fit two Tesla P4s? Also, what OS are you using on your machine?
@@TheSmileCollector It could fit another one, but I'd have to remove its iDRAC module. Ubuntu Server.
What OS do you usually use?
@@Connorsapps Sorry for the late reply! Just got Proxmox on mine at the moment. Still in the learning stages of servers.
So when are the other GPUs coming in?
@@Flight1530 I just got a 4GB NVIDIA GeForce RTX 3060 for a normal PC, but maybe I could get some massive used ones for heating my house once the AI hype cycle is over.
@@Connorsapps lol
I love how sarcastically he was talking about piracy
Too bad he couldn't find a public domain video about pirates for his video.
I have a PER730 8LFF running Unraid. I found this video with a very vague search (tesla llm for voice assistant self hosted), but I was looking at the Tesla P4 for all the same reasons: 75W max. I don't want my R730 going into R737-MAX mode (with the door plug removed in flight, so you get the full turbine sound in the cabin, if you want that "riding on a wing and a prayer" vibe, like you're literally strapped to and riding on the wing during flight). I considered the P40, but I'm in California; the electricity cost difference could be a week's worth of groceries in the Midwest, or lunch and dinner here... Thankfully there's one on eBay for only a couple dollars more than China, and I can have it in 3 days. But it's good to see someone else with basically the same use case. Also running Jellyfin, and wanted acceleration for that too. Anyway, glad you did this. Your vid made me confident in the $100 for a low-budget accelerator. Btw, what is your CPU/RAM config? I'm on 2x E5-2680 v4, 14c each (28c56t), and 128GB 2400 DDR4 ECC. Everything I want to accelerate is in containers, so I should be good. Thanks again 👌
In the Midwest, food cost is actually pretty dang close to everywhere else, but you're definitely right on the electricity. I made this video due to the lack of content on this sort of thing, so I'm very glad it was worth the time. CPUs: 2x Intel Xeon E5-2640 v3 (32) @ 3.400GHz. Memory: 6x 16GB DDR4, ~95GB in total.
Ya might want to try blurring that recipe again. I can read it pretty easily.
Oops. I added some extra blur now, thanks.
Interesting! A tour of the homelab maybe? Subscribed!
Did the instructions it gave you actually work, though? If so, I expect a lot more output from your channel, although it may become nonsensical over time.
I've already started using TempleOS
@@Connorsapps based. After all what are LLMs but a scaled up version of Terry's Oracle application
@@SamTheEnglishTeacher hahaha I forgot about that
Great video. I got my hands on a couple of Supermicro 1U servers and tried the first part (CPU only) of your video. Is there any other GPU that would fit in that slot?
The GeForce GT 730 will, as seen here: ruclips.net/video/5kueBAgigj4/видео.htmlsi=Bl1zuecYDxfYJNgQ&t=188, but you've gotta cut a hole for airflow. You're super limited if you don't have an external power supply, so I'd consider buying a used gaming PC and using it as a server.
A good test would be to show how many tokens/sec you got instead of duration.
Answer: less than 1 token per second. The P4 just doesn't have enough go to make it a usable solution.
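If you want to check that number yourself, Ollama's non-streaming generate response includes eval_count and eval_duration fields (the latter in nanoseconds), so tokens/sec falls out directly. A minimal Go sketch assuming those field names; the model and prompt are placeholders:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// Send one non-streaming generation request to a local Ollama server.
	body, _ := json.Marshal(map[string]any{
		"model":  "gemma:2b",
		"prompt": "Write a haiku about rack servers.",
		"stream": false,
	})
	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var out struct {
		EvalCount    int64 `json:"eval_count"`    // tokens generated
		EvalDuration int64 `json:"eval_duration"` // nanoseconds spent generating
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	fmt.Printf("%.1f tokens/sec\n", float64(out.EvalCount)/(float64(out.EvalDuration)/1e9))
}
```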
pull the lever kronk
Brilliant work. Really well done, Connor. New subscriber here.
Nah guys, 8GB VRAM is too little. I just tried 8B Llama 3.1 and it's trash. So, buying this will.... It's better to just pay for ChatGPT or others.
ChatGPT can't help with basic daily tasks like making meth, as shown in the video.
Have an R720 with a GTX 750 Ti and need more uses for it! Do you think the 2GB of VRAM would make any difference for Ollama?
100% for the smallish models. It's definitely worth trying out a few to see. I'd first try ollama.com/library/gemma:2b, then maybe ollama.com/library/llama3.1:8b to see what happens.
I just found this channel. I hope you do many more LLM videos with your servers.
19:20 You know you can still read that blurred text, right.... At least I can
Mhm ruclips.net/video/t4J_KYp0NGM/видео.html
For anyone trying this on old enterprise hardware on top of VMs: tread carefully with HPE Gen 7 through Gen 8. There's a BIOS bug that will not allow you to do PCI passthrough, and you won't be able to do anything PCI-related. Also, underrated channel.
I'm guessing this is on specific BIOS versions; I've done PCI passthrough on some Gen 8s and luckily did not have any issues.
@@JzJad Mine is a G7. I'm personally on the most recent BIOS version. I've pretty much given up trying to make it work.
@@halo64654 I had done it with VMware and Proxmox. I do remember Proxmox being a bit more of a pain and having issues in some slots, but I never realized it was an HP BIOS issue. RIP.
Dang, just got a P4 and have an HPE G8... welp, worst case scenario is that I can get a better server in the future, I guess... Or sell the card if I really have to...
@@nokel2 I've heard Gen 8 has better results with workarounds, as those tend to be more favored by the community. I have a Gen 7.
Nice video :)
Good interesting video.
Thank you. I've been thinking of starting my own home lab for my final year project, but wasn't able to find a source on where I should start :) cheers mate
I’d love to hear more about it. So do you have any particular hardware in mind?
@@Connorsapps There are a few IBMs around near my local area; I can probably start with them. The last time I tried a Supermicro it didn't like some GPUs. I have plenty of GPUs lying around too, mostly Quadro or Tesla cards. Recently got a batch of AMD Vega GPUs (like the 56 and 64) from a retired mining rig too. Since Ollama is getting support for them, I believe it's worth a try.
If you can fit the entire model into your GPU, you should use EXL2 for free performance gains with no perplexity loss.
Ok, this was funny and educational.
The R630XD and R730XD have room for a decent-sized GPU and PCIe power connectors you can use with adapters.
I was actually looking into buying one of those models, but I couldn't justify another heat-generating behemoth in my basement.
love the Emperor's New Groove reference haha
Great video
This was a well-made video. Going forward, is this channel going to be about home lab or server stuff? I'm working on my own home lab running Llama 3 on Ollama with my 3090 FE (I know it's overkill lol), and I love seeing people make their own stuff. Also, do you know how to make two GPUs work with Ollama? I added in a 3060 Ti FE and it isn't being used at all.
Programming and tech is my biggest hobby, so next time I have a bigger project I'll probably make a video. Depending on the models you're using, GPU memory seems to be the real bottleneck. As for getting two GPUs to work with Ollama, I wouldn't think this is supported. Here's a GitHub issue about it: github.com/ollama/ollama/issues/2672
I have not been able to split a model across multiple GPUs, but Ollama has loaded a second model onto a second GPU, or offloaded part of a model to the CPU. I have an RTX 2000 Ada Gen (16GB) and an old NVIDIA 1650. With the context window, my main LLM is about 12.5GB or so; that goes onto the Ada Gen. When I send something to the 4GB llava/vision model, it dumps most of it onto the 1650, with a small chunk going to the CPU. It is significantly slower than the main model, but not annoyingly so (and hey, I only use it occasionally).
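One way to get deterministic placement like that is to pin each server process to a specific GPU with CUDA_VISIBLE_DEVICES, a standard CUDA runtime variable rather than anything Ollama-specific. A hedged Go sketch that launches ollama serve restricted to GPU index 0:

```go
package main

import (
	"os"
	"os/exec"
)

func main() {
	// Restrict the CUDA runtime to GPU index 0. A second instance could use
	// "CUDA_VISIBLE_DEVICES=1" (and would also need a different OLLAMA_HOST
	// port so the two servers don't collide).
	cmd := exec.Command("ollama", "serve")
	cmd.Env = append(os.Environ(), "CUDA_VISIBLE_DEVICES=0")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		panic(err)
	}
}
```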
The title says Tesla P40, but you are using a Tesla P4. I'm not sure if the title is wrong or if I got it wrong. Aren't they different GPUs?
Oops
I like this video, keep this up!
You know there's a button that saves you the time of expressing this as a comment, right? As a bonus, it tells YT that you like it too, so it can be prioritized higher in searches and stuff 😉
Connor Skees was born to be famous.
6 views woo hoo
11/10 would recommend to a friend.
hey, it's beck49. I don't know if you remember me, but I was an admin for your Minecraft server 7 years ago. I didn't know if this was you at first, but I saw I was already subscribed and your voice sounded very familiar, so I connected the dots.
Haha yes I do remember. Oh Minecraft, those were the days.
thanks for sharing those very cool synths that I've never played with (since the factory preset library is huge af)
Yep yep, I’m very picky so I was surprised there weren’t that many other videos with good presets
@@Connorsapps make some more vids
Cool
Thanks
Some cool synths here. Thank you!
amazing
Beautiful man