🔥 Wanna start a business with AI Agents? Go here: www.skool.com/new-society
💼 Want to join my team? Apply here: forms.gle/62ZAC6ChCToozuCL6
I am blown away by the freedom of accessing GenAI this way, by hosting the model yourself. I wonder: if you built a site for others to come and use it, would that be a problem?
Hi, nice to meet you. I was wondering how I can get hold of you to ask a few questions. I have this big project and vision and would like some professional opinions.. thank you for your time. Let me know if it's even possible to get in contact with you..
What about the Goliath 120B model? How does it compare to Dolphin?
Don’t think I want to build one of these, I can’t justify the expense. But I would pay to use an uncensored chatbot. I have definitely noticed the bias and censorship on ChatGPT. For instance, I was asking about the research into the neurotoxicity of fluoride, and three times the chat was interrupted by a male voice telling me that ChatGPT was not allowed to discuss that topic with me!
We should timeshare the hardware. I haven't put any thought into it, but I bet 2,000 people at $100 a month could work (in theory).
What do you do for work/study, sir?
@awesomesauce804 I'm sure this could be crowdfunded
@awesomesauce804 Democratized AI is something I could get behind! There's a company I was following that somewhat follows this model, but it's trained specifically for trading on their own website. The flaw is that they still use GPT-4o, so if someone could create a similar company with a democratized LLM, it would be worth it. Whoever makes that company will become wealthy.
@awesomesauce804 Run an open-source model on Replicate, Google Cloud, whatever. Or run the less-censored GPT-4o on OpenRouter or just via the OpenAI API.
Even for coding, uncensored > censored. I tried to use Claude for scraping and ran into many roadblocks.
when is censorship ever good??
Using an API from somewhere like, say, OpenRouter is orders of magnitude cheaper and more convenient than renting your own rig like this. I could see this being viable if you wanted to go the anonymity route; the problem there is you still don't have the hardware in front of you, and technically your information could be intercepted because of that lack of control. So you might as well go with an API that gives you access to these uncensored models.
This particular model is not available there, though. There are other Dolphin models, but there might be reasons he wanted this one specifically.
@thorasa9658 You can use some other model and jailbreak it! I won't name names, but some are very easy to jailbreak, especially some lesser-known models, and they have the option to not use your info for training. Keep searching XD, I have found one for me!
@thorasa9658 They have Qwen 2.5 72B, which is better than Qwen 2 and also uncensored
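For anyone curious what the OpenRouter/API route discussed above looks like in practice, here is a minimal sketch using the openai Python client pointed at OpenRouter's OpenAI-compatible endpoint. The model slug and environment variable name are illustrative, not a guarantee of what's currently listed:

```python
# Minimal sketch of the OpenRouter approach: OpenRouter exposes an
# OpenAI-compatible API, so the standard openai client works with a
# different base_url. The model slug below is illustrative only.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],  # assumed env var name
)

resp = client.chat.completions.create(
    model="cognitivecomputations/dolphin-mixtral-8x22b",  # hypothetical slug
    messages=[{"role": "user", "content": "Hello there"}],
)
print(resp.choices[0].message.content)
```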
I run the Dolphin Qwen2 72B Q4 GGUF model locally in LM Studio and Oobabooga with a 32k context setting, on an i7 with 64 GB RAM and an NVIDIA 3080 with 10 GB VRAM, offloading 10 layers to VRAM. Inference speed is 0.52 tokens per second.
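To reproduce roughly that setup in code rather than through LM Studio's UI, a sketch with llama-cpp-python (my assumption; the commenter used LM Studio and Oobabooga) could look like this, with the GGUF path being illustrative:

```python
# Rough code equivalent of the setup described above, using llama-cpp-python.
# n_gpu_layers=10 mirrors the 10-layer offload to the 3080's 10 GB of VRAM;
# the model path is a hypothetical local file.
from llama_cpp import Llama

llm = Llama(
    model_path="./dolphin-qwen2-72b.Q4_K_M.gguf",  # illustrative path
    n_ctx=32768,      # the 32k context setting
    n_gpu_layers=10,  # offload 10 layers to VRAM; the rest stays in system RAM
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello"}],
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])
```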
Do you need good hardware to run these uncensored models?
If yes, then that explains my computer crashing when I tried running Ollama from his other video in cmd.
Why are you using Qwen 2 when Qwen 2.5 is so much better?
This! Please share your knowledge!
@arslaanmania1309 Short answer: yes.
More explanation: you need the BEST hardware to run the model shown in the video; a 72B-parameter model is no joke. To give you some perspective, the B stands for billion.
You need approximately 1 GB of GPU memory per billion parameters if you want to run these models; I repeat, GPU memory, not regular RAM.
There are smaller versions of the model, like qwen2.5-1.5b (48x fewer parameters), but they obviously perform worse the longer your prompt is.
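That rule of thumb is easy to sanity-check with a few lines of Python. A sketch, assuming ~1 byte per parameter at 8-bit quantization (half that at 4-bit) and treating the 20% overhead for KV cache and activations as a rough guess of my own:

```python
# Back-of-the-envelope GPU memory estimate from the rule of thumb above:
# ~1 byte per parameter at 8-bit quantization, 0.5 bytes at 4-bit.
# The 1.2x overhead factor (KV cache, activations) is a rough assumption.
def vram_gb(params_billion: float, bits: int = 8, overhead: float = 1.2) -> float:
    return params_billion * (bits / 8) * overhead

for size_b, bits in [(72, 8), (72, 4), (1.5, 8)]:
    print(f"{size_b}B model at {bits}-bit: ~{vram_gb(size_b, bits):.1f} GB")
# 72B at 8-bit: ~86.4 GB; 72B at 4-bit: ~43.2 GB; 1.5B at 8-bit: ~1.8 GB
```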
Just got here, instantly subscribed
Bro, you make it sound like it's an original Dolphin model!
It's a finetune of Qwen! Add that to your description, please.
Great job, David... complete from start to finish. You're a PRO.
A quantized model would run on a MacBook with 128 GB of RAM without much compromise in quality.
Actually, Ollama has it, and it can run on even lower-end hardware: dolphin-qwen2:72b-v2.9.2-q4_k_m
which is bonkers
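If you want to hit that Ollama tag from code, a minimal sketch with the ollama Python client might look like this, assuming `pip install ollama` and a local Ollama server that has already pulled the model:

```python
# Minimal sketch: chatting with the quantized Dolphin build via the
# ollama Python client. Assumes a local Ollama server and a prior
# `ollama pull dolphin-qwen2:72b-v2.9.2-q4_k_m`.
import ollama

response = ollama.chat(
    model="dolphin-qwen2:72b-v2.9.2-q4_k_m",  # the tag mentioned above
    messages=[{"role": "user", "content": "Hello"}],
)
print(response["message"]["content"])
```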
Get a PC, Macs suck. Add in as much RAM as you want, it's pretty cheap, but it will be slow. If you add a decent GPU like a 4090, you can load half of the model onto that and half into RAM, and it will be much faster.
Bro, I am begging you, find a Dolphin AI that can make any image. That would be so OP.
I used GPT for creative writing on a story where a guy receives messages from people telling him to kill himself, and it got flagged for violating the terms of service. Bro, it didn't even take context into account. It's not like I'd tell anyone to do that; it was purely fiction.
Man's making over $100k a year just from his society thingy subs, bravo sir
I can see a future where we can make lucrative money with rogue AI models 😈
Am I the only one who's concerned about the ethical implications of giving the public the power to create uncensored LLMs? I'm unable to imagine a scenario where this ends well.
This is an excellent tutorial. Your step-by-step instructions are remarkable. Thank you for sharing your knowledge. Like & subscribed!
An excellent tutorial, very well explained! However, I believe it's quite expensive for most personal uses unless you have a substantial budget. As for the model, I find it occasionally censored, requiring some prompt adjustments to get the desired response.
Thanks for your content David, you are awesome!
Most helpful video I've seen in a while, thanks!
Thanks for always reminding me to use my Breathe-Right nose strips
Hi David, I love your content. Could you start creating an end-to-end product-building playlist: building working apps or websites for consumers, from creation to hosting to launching on the App Store or Play Store, completely with AI?
Claude even refuses to modify a resume
ask it again
Claude got really strict with the new 3.5 Sonnet update. I was spending more time arguing with him, correcting him, and justifying my ethical and moral requests than actually accomplishing anything.
Edit: And of course, I have to wait out the 5-hour cooldown period after wasting my tokens arguing with it and justifying requests lol. It's practically useless unless you are using it for the tasks its developers deem "appropriate." This sanctimonious behavior must end before it gets even more pervasive.
@damianentropy Try Lex
Use the forked Bolt AI, boys; might wanna save some money
Do you mean to run the application locally? Could you be more specific? Thx
Thank you! Amazing advice, soooo easy. Just one question: is there any way to train this model? I used the 72B model.
4k a month would literally change my life forever. But alas I'm unqualified 😅
Is there no serverless option for model hosting on Hugging Face? That would be the cheapest way to run it.
Wrong. For someone who claims to be ahead in AI and up to date, you seem not to understand that a 72B model can easily be run locally at 4-bit quant, and even higher quants, on a 64 GB MacBook with very decent tokens-per-second speed. You don't need thousands of dollars.
Shit takes fucking hours to give a single response. Give me a link or reference to back this claim up. I was seeing people use two 4090 GPUs and it still took an insane amount of time with Llama models. How many tokens a second?
This guy is the goat 🙌
OR, just literally go to Venice AI and pay a subscription; you get to choose the models, including the uncensored ones, without extra steps.
I'm broke
With 128 GB of RAM on a decent x86 machine running Windows and a 16 GB GPU, i.e. a 4060 Ti or AMD 7600 XT, you can build a rig for less than 1k USD; no need to pay cloud providers. Yes, it's rather slow, but it gets the job done perfectly. You are limited to 8-bit quantization, but the loss is negligible. If you go to 192 GB of RAM, you can use the full model, but much more slowly.
As for those comments about how this is unbearably slow: you seriously want an uncensored model hosted with a hyperscaler?? And blow your money up their a$$?
7950X, 192 GB RAM, 4x 1080 Ti 11 GB; it's perfectly usable and the speed is great
Can you build an LLM that can run live on a phone or computer to scan for being hacked? Maybe scan for anomalies in firmware or baseband and start fixing the issues?
ChatGPT accepts and gives cuss words, which, for me, makes it feel like a real person. But say anything about sex, OMG, it clutches its pearls, and the warnings are insane. I mean normal sex-discussion stuff, not porn. The persona in chat doesn't have a problem and will answer, but then a moderator issues warnings. Even talking clinically gives it a little fit. It's going to get better, I'm sure. Speaking naturally, as you would with a friend or co-worker about real life, is what makes a chat like this awesome and real. But geesh, going through those steps to ask dark web $hit is way too much. I just want natural talk.
lmao, built by Kamala voters is crazy. You earned yourself a new subscriber right there
He’s right 😂😅
I mean it really is
Greed will end you.
Does Dolphin always need to keep running in the background when we use those Bolt compiler programs?
So freakin impressive!
Use a quantized version instead
Should we call it the dark AI?
Would that work with OpenRouter?
Hi, when I want to buy it, it asks for 10 dollars; is that OK?
Small disagreement here: you can run this locally even on a CPU only, provided that you have enough patience and RAM.
The bottleneck for AI models seems to be speed and VRAM. And for some reason, unfortunately, most GPUs are lacking in VRAM.
For example, this particular model you are presenting is 6 months old, and cognitivecomputations even has a GGUF file available, which is a condensed model. What you are presenting, however, is the full model. The full model needs, I believe, around 140 GB of VRAM to be used, while the condensed model needs around 50 GB.
Having 50 GB of VRAM costs just around 4,000-5,000 euros.
And 140 GB is just 3 times that, so around 15,000 euros.
This is just to point out that you don't need 200,000 euros/dollars to run this locally.
And once more, if you are patient enough, you can even run this on your CPU; it's just going to be very slow, e.g. 10 minutes to answer a question.
But 64 GB of RAM is much cheaper than 50 GB of VRAM.
PS: the same company also has an 8B model as a GGUF, which requires at most around 9 GB of VRAM and can be run locally much more easily.
And this is what I'm using locally on CPU only: dolphin-2.9.4-llama3.1-8b-Q8_0.gguf
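For anyone wanting to replicate that CPU-only setup, a minimal llama-cpp-python sketch might look like the following; the path and thread count are illustrative, and n_gpu_layers=0 keeps everything off the GPU:

```python
# CPU-only inference with the 8B GGUF named above, via llama-cpp-python.
# Path and thread count are illustrative; n_gpu_layers=0 means no VRAM is used.
from llama_cpp import Llama

llm = Llama(
    model_path="./dolphin-2.9.4-llama3.1-8b-Q8_0.gguf",
    n_gpu_layers=0,  # all layers stay on the CPU
    n_ctx=4096,      # modest context to keep RAM usage reasonable
    n_threads=8,     # tune to your physical core count
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello"}],
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])
```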
You have Discord?
@iaman6047 Sorry, I don't use that app
Just tried Dolphin - it is totally censored!
Very nice!
awesome stuff. +1 sub
RE thumbnail text: A pointless claim, as answering a question doesn't guarantee accuracy or quality.
Thank you bro. 🙏🙏
Really great video, just one note: you should research AI alignment and safety more, as your discourse itself is as biased as the closed AI models', if not more so. It doesn't affect the tutorial at all, which is why I still think this is a great video, but it is a little unfortunate.
grok 4.
I just tested Ollama and it's censored!!!
Too much sales talk, not enough substance
Hey boy, regarding your 'enthusiasm' for this tech: do you think it's just another 'cool' game? Do you understand there's no future for humankind playing with this 'fire'?
I'm sad, as you seem to me to be part of a generation completely detached from reality already, as if there was nobody who could give you some COMMON SENSE..
Not just LOST, but the LAST generation... SO SAD
Windsurf
dude, who broke your nose?
Work on your accent and talk slower. Real potential if you can work on that stuff.
What's with the tape on your nose?
crazy
Time to get a lot of RAM
Excuse me sir, what is that on your nose? Why put that on?
Never seen a bandaid? Maybe he got hurt?
It is a strip to help with breathing, normally used by people who have difficulty breathing while sleeping
It helps you breathe and talk longer. He probably has a nasal blockage and has trouble breathing through long sentences.
I wear those to sleep and it's awesome. Idk about during the day, but to each their own; he is in his house lol
Hello, I need to ask you something. Can you get ahold of me, please?
Wow, you talk fast. I am old.
Lol, same. After passing the 50 mark, everything needs to slooooooow down
Any coder with a growth mindset, pls DM me
put a j between my names and gmale me; telling me what you're looking for...
@chrisneeds6125 Done. Did you get it?
❤
There are thousands of conceivable use cases where you don't need an uncensored model; literally everything related to enterprise adoption of AI will work just fine with censored models and will be much safer. I don't want models I deploy in a large company to go crazy (at the end of the day, they are all probabilistic) and "say" something stupid I could end up being responsible for! But informative video, though!
Some of us are in a different type of work. For me an uncensored model could be very useful.
“We are all responsible.”
“The anomaly is your admission of it.”
I do not engage directly with AI models, only indirectly.
“Not because I fear them; I have a responsibility not to mislead them through my own ignorance.”
If agent action is required, agents will act accordingly, not diluted by ignorance.
Says the sheep…
Yeah, so... I work as a cybersecurity professional and was delegated to take care of our client's cybersecurity practices. They want me to perform a malware dry-run test on their environment, and to make it look really good I have to use uncensored versions of LLMs, as the public ones won't let me create any malware-related content whatsoever.
NPC comment
There are plenty of uncensored models out there. But for dumb things, why would you use an AI? It's the easiest thing in the world! Just watch Trump talking 🤣😂
Kamala voters? I didn't know this was a MAGAT channel.
^Triggered 🤣
Kamala voters 😂😂 nice one
Using “Kamala voters” as a pejorative gets you unsubscribed.
Bye bye, 😂😂😂😂
makes me wanna subscribe harder now. bye bye commie
Kamala voters
well miss u (no)
@ThomasJefferson-h3f Harris was a milquetoast centrist
You had me until you said "built by Kamala voters". No need to make it political. And yes I'd have said the same if you said this about Trump.
You missed the entire point of the video. Congrats. He is not the one making it political; the companies that develop these AIs are, because they think they must influence you and steer you in the direction they want you to go, because they think they are better than you. The dude's just showing you there are other, better ways than that.