Build your own local o1 - here’s how

  • Published: Nov 21, 2024

Comments • 117

  • @DavidOndrej
    @DavidOndrej  26 days ago +8

    Wanna build your own AI Startup? Go here: www.skool.com/new-society

    • @startingoverpodcast
      @startingoverpodcast 26 days ago

      Why aren't you using Msty?

    • @aaaaaaaaooooooo
      @aaaaaaaaooooooo 25 days ago

      Wait, my data is not private with o1? I didn't know that. Where can I check this? Where is this disclosed to the user, or did they bury it in the fine print?

  • @eviv8010
    @eviv8010 25 days ago +43

    nice clickbait

    • @kylev.8248
      @kylev.8248 23 days ago

      It’s not clickbait tho

    • @bruce_x_offi
      @bruce_x_offi 20 days ago

      @@kylev.8248 You must be King of fools

  • @samimejri8079
    @samimejri8079 26 days ago +7

    I just used Llama 3.2 locally and asked about starting a 3D printing business as a beginner. It gave output similar to what you spent a good chunk of this video building... Maybe next time, show a before-and-after response from an LLM.

  • @chrystofferaugusto1194
    @chrystofferaugusto1194 26 days ago +1

    Btw, the concept you arrived at in this video, of an undetermined number of agents, is far superior to the one from your video 5 days ago. Really awesome 👏🏻

  • @indiemusicvideoblog
    @indiemusicvideoblog 26 days ago +43

    Great! Now build a local agent with Llama that can control your computer like Anthropic's

    • @orthodox_gentleman
      @orthodox_gentleman 26 days ago +8

      Very doable with Open-Interpreter which is open source and free

    • @Bllakez
      @Bllakez 26 days ago +5

      @@orthodox_gentleman How much should I pay someone to set it up for me?

    • @alexrayoalv
      @alexrayoalv 25 days ago +6

      I literally did this 6 months ago.

    • @anubisai
      @anubisai 25 days ago +1

      You build it.😂

    • @marilynlucas5128
      @marilynlucas5128 25 days ago

      Skyvern!

  • @godned74
    @godned74 24 days ago +1

    You could try "When providing responses, use concise and primary representations. However, include additional details only when needed to ensure clarity and completeness of the task" and you should get short responses without compromising the chain of thought.
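    The suggestion above amounts to setting a system prompt. A minimal sketch of how that could look with the ollama Python client (an assumption on my part; it requires `pip install ollama`, a running ollama server, and the "nemotron" model pulled):

    ```python
    # The exact instruction quoted in the comment, used as a system message.
    SYSTEM_PROMPT = (
        "When providing responses, use concise and primary representations. "
        "However, include additional details only when needed to ensure "
        "clarity and completeness of the task."
    )

    def build_messages(user_prompt):
        """Prepend the concision system prompt to a single user turn."""
        return [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_prompt},
        ]

    # Live usage (assumes the client library and a local server):
    #   import ollama
    #   reply = ollama.chat(model="nemotron",
    #                       messages=build_messages("Why is the sky blue?"))
    #   print(reply["message"]["content"])
    ```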

  • @DCinzi
    @DCinzi 25 days ago +9

    There is a model called Llama3.3B-Overthinker. I think it would fit the task quite nicely.

    • @JackGamerEuphoriaDev
      @JackGamerEuphoriaDev 25 days ago

      Is it available on Ollama or Hugging Face, if you don't mind the question? Thanks, by the way, for giving directions.

  • @Luxcium
    @Luxcium 26 days ago +5

    😂 I love the way you called out your mistake at 4:00. It was just so delightful to see you handle it like a boss that I had to replay it more than 3 times to enjoy the moment... You are definitely a smart man!!! I am eager to see the evolution over time!!! 😅

  • @MrMoonsilver
    @MrMoonsilver 26 days ago +2

    Cool new format with the presentation man

  • @szebike
    @szebike 25 days ago +1

    Nice, your contribution to the open source community is awesome!

    • @ysh7713
      @ysh7713 25 days ago

      opensource?

    • @szebike
      @szebike 25 days ago

      @@ysh7713 Well, kind of ~ better than giving all your data to a faceless big company that will steal it 100%.

  • @TheDarkLordAngel
    @TheDarkLordAngel 17 days ago

    That mark on your nose is almost like a signature, something that's so naturally you. 🖖👍

  • @foxusmusicus2929
    @foxusmusicus2929 24 days ago +2

    Great video. Which hardware specs do you have? :-)

  • @mariomanca7546
    @mariomanca7546 26 days ago +2

    If you instruct the agent to use the fewest possible lines, it's likely to eliminate comments, which is suboptimal but expected.

  • @MiNiD33
    @MiNiD33 25 days ago +1

    "Comments are apologies in code." - Robert C. Martin.
    Cursor is helping you.
    Also, for the price of this machine's spec, you can buy an insane number of tokens from Anthropic or OpenAI. It might be worth getting people started on a hosted service.

  • @FuZZbaLLbee
    @FuZZbaLLbee 25 days ago

    You can also use the ollama streaming output to generate text. This way you know what the generator is doing.
    Also, I think GPT o1 does more than split up a task and let agents fix the individual tasks. But nevertheless, a nice tutorial on making agents.
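    A sketch of the streaming idea above, assuming the ollama Python client: with `stream=True` the response arrives as chunks, so you can print tokens as they are generated instead of waiting for the full reply. The helper that joins chunks is pure, so it works with any iterable.

    ```python
    def accumulate(chunks):
        """Join streamed chat chunks into the full response text."""
        return "".join(chunk["message"]["content"] for chunk in chunks)

    # Live usage (assumes `pip install ollama`, a running server, model pulled):
    #   import ollama
    #   stream = ollama.chat(model="nemotron",
    #                        messages=[{"role": "user", "content": "Why is the sky blue?"}],
    #                        stream=True)
    #   for chunk in stream:
    #       print(chunk["message"]["content"], end="", flush=True)  # live tokens
    ```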

  • @eado9440
    @eado9440 26 days ago +10

    🎉 you actually made it. Thanks

  • @mihaitanita
    @mihaitanita 25 days ago +10

    So, you've used Claude 3.5 (October 2024 update) within the Cursor AI editor to develop a (simple) Python script that runs some agent logic on a 70b model in ollama?
    Where's the o1 in here?

    • @Dancoliio
      @Dancoliio 25 days ago +4

      o1 is a reasoning model whose makers kept their reasoning 'recipe' private. This is his take (which resonates with the average user of locally run open source models) on hacking the way the 70b model works and simulating reasoning to enhance the final output: a simple method which actually does provide better replies.

    • @BikramAdhikari89
      @BikramAdhikari89 25 days ago +1

      He is not sharing a research paper published on arXiv, my man.
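    The "simulate reasoning" idea described above can be sketched as plan-then-execute: one model call decomposes the task into steps, one call works each step with prior notes as context, and a final call synthesizes the answer. The prompts and the `chat` callable below are illustrative assumptions, not the video's actual code; `chat` would wrap something like an ollama chat call and return the reply as a string.

    ```python
    import json

    def plan(task, chat):
        """Ask the model to decompose the task into a JSON list of steps."""
        reply = chat("Break this task into a short JSON list of steps: " + task)
        return json.loads(reply)

    def solve(task, chat):
        """Plan, work each step with earlier notes as context, then synthesize."""
        notes = []
        for step in plan(task, chat):
            # One "agent" call per step; earlier notes flow into later steps.
            notes.append(chat(f"Task: {task}\nCurrent step: {step}\n"
                              f"Notes so far: {notes}"))
        return chat(f"Task: {task}\nCombine these notes into a final answer: {notes}")
    ```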

  • @VinceOmondi
    @VinceOmondi 26 days ago

    Good stuff, Ondrej!

  • @orthodox_gentleman
    @orthodox_gentleman 26 days ago +5

    Dude, there are very few people that can run nemotron locally….

  • @hrarung
    @hrarung 25 days ago

    Awesome video, David! How do I train this model on my own dataset? And how do I give it a nice UI?

  • @AGINews-TogethWithAI
    @AGINews-TogethWithAI 25 days ago

    exactly what I needed thank you so much David🎉

  • @bsiix1576
    @bsiix1576 25 days ago +1

    Maybe I missed it, but what hardware is needed for that nemotron - it is 43GB? Doesn't that mean you need at least that much VRAM? And here I thought I was a baller with my 16GB of VRAM...

  • @EtH-xf6br
    @EtH-xf6br 25 days ago

    What a beast of a MacBook you need to have to get such a fast response. I have a 7800X3D and an RTX 4080 and it's waaay slower.

  • @devbites77
    @devbites77 24 days ago

    Inspiring stuff. Cheers!

  • @hotlineoperator
    @hotlineoperator 26 days ago +2

    I have tested o1 - and it is not so smart. People still need to guide its selections. A big problem with models is censorship: someone else has selected what you can and cannot do with these tools.

  • @jefferystartm9442
    @jefferystartm9442 24 days ago

    Brooooo, there are tools you are behind on. Agent S and Claude computer use?? E2B has an open source version too 😊 stay blessed Ondrej

  • @jayhu6075
    @jayhu6075 25 days ago

    What a great explanation. Thnx

  • @AK-ox3mv
    @AK-ox3mv 23 days ago +1

    How much more accurate are your local o1 results compared to the original nemotron 70b and llama 3 3b without using chain of thought?
    Was there any improvement in benchmarks like HumanEval and MMLU?

  • @skulltrick
    @skulltrick 25 days ago

    Very inspiring! Thanks

  • @Visualife
    @Visualife 26 days ago

    You should use Anything LLM and docker / Open WebUI

  • @FrankDecker-n9e
    @FrankDecker-n9e 23 days ago +1

    @DavidOndrej, what are your Mac specs? I have a MacBook Pro M3 Max 48 GB..

  • @gaelfalez
    @gaelfalez 25 days ago +1

    Missing the comparison between the result using multiple agents and the result using just one....
    Disappointing. We don't even know if it is worth the work....

  • @michaeltse321
    @michaeltse321 25 days ago +1

    You downloaded nemotron and not the 70b version, which is why you had the error

  • @Plife-507
    @Plife-507 8 days ago

    I want to build an agent swarm to do coin-margined BTC futures trading, with each agent handling a separate part: TA, market sentiment, execution, risk tolerance. Is there a way to keep each model small and only train it to focus on its task?

  • @zechariahprince5671
    @zechariahprince5671 2 days ago

    We have had AGI for over a year.

  • @11metatron11
    @11metatron11 22 days ago

    Not a chance with my elderly MacBook Pro. Looks like I need some new gear…

  • @aatheraj1667
    @aatheraj1667 23 days ago

    Yet we don't have one that could trade Nasdaq futures.

  • @costatattooz840
    @costatattooz840 26 days ago +3

    Locally, what hardware do you need to run this at minimum? I have 64GB RAM + a 3060 12GB

    • @ticketforlife2103
      @ticketforlife2103 26 days ago

      Watch the video

    • @H3XM0S
      @H3XM0S 26 days ago +5

      You'll need over 40GB of VRAM, so 2 x RTX 4090 might be a good option. No idea what hardware is being used in the video. Anyone saying 'watch the video' should provide a timestamp.

    • @bollvigblack
      @bollvigblack 26 days ago

      this guy is rich, not even joking

    • @chrystofferaugusto1194
      @chrystofferaugusto1194 26 days ago +4

      He is on a MacBook Pro bro…

    • @skeyenett
      @skeyenett 26 days ago

      64GB RAM + 4070 Ti Super (16GB VRAM) = Run Nemotron-70b-instruct-q2_K
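    Fit estimates like the ones in this thread can be sanity-checked with a rough bits-per-weight calculation. The figures below are back-of-envelope approximations for llama.cpp-style quants, not exact ollama file sizes:

    ```python
    def approx_model_gb(params_billion, bits_per_weight):
        """Approximate weight-storage size in GB at a given quantization level."""
        # params_billion * 1e9 weights * bits/8 bytes, then / 1e9 for GB:
        # the 1e9 factors cancel, leaving params * bits / 8.
        return params_billion * bits_per_weight / 8

    print(approx_model_gb(70, 16))   # fp16, ~140 GB: unquantized 70b, Mac unified memory territory
    print(approx_model_gb(70, 4.5))  # roughly q4_K_M, ~39 GB: the "over 40GB VRAM" estimate above
    print(approx_model_gb(70, 2.6))  # roughly q2_K, ~23 GB: why it can run on 16GB VRAM + RAM spillover
    ```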

  • @avi7278
    @avi7278 25 days ago +4

    Oh yeah im sure openai is quaking in their boots, bro.

  • @MrMoonsilver
    @MrMoonsilver 26 days ago +6

    Also, I hope the bruise on your nose heals soon. Been a long time now.

    • @Tetardo
      @Tetardo 26 days ago +1

      I think it’s a medical device that helps him breathe

  • @KiranMohan-dpthinkr
    @KiranMohan-dpthinkr 25 days ago

    Hey David, how can we reassure clients that their data is secure and won't be shared with the LLM provider for internal training purposes? What steps can we take to ensure their data privacy and address any concerns they might have?

    • @cdunne1620
      @cdunne1620 25 days ago +1

      You'd have to ask that in David's classroom at Skool

    • @KiranMohan-dpthinkr
      @KiranMohan-dpthinkr 25 days ago

      @@cdunne1620 Sure

    • @haljohnson6947
      @haljohnson6947 24 days ago +1

      He mentions that in the video like four times

    • @KiranMohan-dpthinkr
      @KiranMohan-dpthinkr 24 days ago

      @@haljohnson6947 can you mention the specific timestamp where he described it?

    • @KiranMohan-dpthinkr
      @KiranMohan-dpthinkr 24 days ago

      @@haljohnson6947 pls mention the timestamp where he mentioned it.

  • @olivert.7177
    @olivert.7177 26 days ago +4

    There is also a nemotron-mini model which is only 4b.

  • @slt
    @slt 25 days ago

    Dadusak!

  • @SjarMenace
    @SjarMenace 26 days ago +4

    why do you have that thing on your nose?

    • @babyjvadakkan5300
      @babyjvadakkan5300 26 days ago

      For correcting the nasal path/nose bridge (or something like that)

    • @INeedMeme
      @INeedMeme 26 days ago

      More oxygen bro

    • @cdunne1620
      @cdunne1620 25 days ago

      Soccer players used to wear them years ago, for example Robbie Fowler for Liverpool

  • @immortalityIMT
    @immortalityIMT 25 days ago

    Cool!

  • @borick2024
    @borick2024 24 days ago

    Have you had a chance to compare your results against GPT4o?

  • @chrystofferaugusto1194
    @chrystofferaugusto1194 26 days ago

    You should have a Discord community for people to share projects and business ideas

    • @chrystofferaugusto1194
      @chrystofferaugusto1194 26 days ago

      Never mind, now I get the business model on Skool. Nice call, thinking about joining it

  • @sushilsharma1621
    @sushilsharma1621 17 days ago +1

    clickbait or misleading title

  • @MrAndrew535
    @MrAndrew535 25 days ago

    I want to preserve a million-word dialogue between myself and ChatGPT across multiple threads while upgrading to your recommendations. How do I achieve that?

  • @themax2go
    @themax2go 5 days ago

    modern day sham(mer) 👍

  • @TheAsianDude9999
    @TheAsianDude9999 24 days ago

    What VS Code extension are you using for your AI?

  • @aaaaaaaaooooooo
    @aaaaaaaaooooooo 25 days ago

    Are my prompts on o1-preview used to train the AI even if I opt out? Where do I find this information?

  • @SCHaworth
    @SCHaworth 26 days ago

    No. Not quite. You have to split the turns.

  • @dorukkurtoglu
    @dorukkurtoglu 16 days ago

    27:36 LOL🤪

  • @dark_cobalt
    @dark_cobalt 26 days ago +2

    Already have it lol. Running it on my RX 7900 XTX with q4m, but I think I'll buy 1-2 Radeon W7900 Pros to gain a lot more performance. Also, you don't need Ollama for it, because it's available in LM Studio and downloads from Hugging Face.
    Btw what PC hardware specs do you have?

    • @rhadiem
      @rhadiem 25 days ago

      He's clearly using a 128GB MacBook Pro, which can use the memory as VRAM. He's running un-quantized. How much VRAM do you have on your gaming GPU? Nobody asked about your hardware, bro.

    • @dark_cobalt
      @dark_cobalt 25 days ago +1

      @@rhadiem Every PC can use the RAM as VRAM. It's how computers work. It's called virtual memory. If the VRAM fills up, the computer uses the RAM as backup memory to stay stable and not crash. But the RAM is waaaaaaay slower than the VRAM; that's why I am asking him what specs he has. My GPU has 24GB of VRAM, and even with the Quant 4M (around 32GB) model of Nemotron 70B my VRAM gets filled completely and my RAM also fills to 50GB, which slows down the model to such a degree that it's painfully slow. He is using a way bigger model without any issues. If he has a GPU with a huge amount of VRAM, this would be totally understandable, but with RAM? I don't understand why lol. 😄

  • @danieleduardo9800
    @danieleduardo9800 26 days ago

    How’d you get composer in the sidebar?

  • @adithyansreeni7491
    @adithyansreeni7491 24 days ago

    i fkin slep bro

  • @rafaelortega1376
    @rafaelortega1376 21 days ago

    No repo to share the code?

  • @gauravrewaliya3269
    @gauravrewaliya3269 25 days ago

    How do you make a local AI with a backpropagation-like feature (if it gets something wrong, the CEO agent points out what's wrong and the sub-agents improve over time)?

  • @blasterzm
    @blasterzm 20 days ago

    Lol, that's not how o1 works. You can't just tell it in the system prompt

  • @claxvii177th6
    @claxvii177th6 25 days ago

    1 token per second is too slow for any practical use...

  • @supermandem
    @supermandem 25 days ago

    Bro llama is nowhere near o1 wtf

  • @aljosja3353
    @aljosja3353 26 days ago

    Which computer can you use for a local LLM?

  • @Álvaro-o5e
    @Álvaro-o5e 25 days ago +3

    99% of free stuff sucks. One of them is this video. 20 minutes to answer "why is the sky blue?"

    • @overunityinventor
      @overunityinventor 23 days ago +1

      free stuff has a learning curve, it's not everyone's cup of tea

    • @tomwawer5714
      @tomwawer5714 22 days ago +2

      99% of paid software sucks and it hurts your wallet

  • @ShishuSud
    @ShishuSud 26 days ago +1

    😇

  • @HimaLoubi
    @HimaLoubi 26 days ago +1

    😂 you need a graphics card with the price of a Tesla car to run that model locally; btw you talk like 10,000 words/min 😅

  • @EduardoAlarconGallo
    @EduardoAlarconGallo 25 days ago

    Title is misleading. You are using Llama, which is an LLM but not a reasoner model

  • @gustavramedies2901
    @gustavramedies2901 25 days ago

    David, I would like to create sales agents, lead generators, receptionists, and appointment setters, and I want to sell them. Can you help 😢

  • @surendarreddys7298
    @surendarreddys7298 26 days ago +2

    1st one to comment 😄

  • @stefanschz7589
    @stefanschz7589 25 days ago

    Awesome!

  • @TheBhushanJPawar
    @TheBhushanJPawar 9 days ago

    I am getting the following error:
    bhushan@Bhushans-MacBook-Pro ~ % ollama run nemotron
    Error: llama runner process has terminated: signal: killed

    • @TheBhushanJPawar
      @TheBhushanJPawar 9 days ago

      After clearing some memory, it started working...