[ML News] Llama 3 changes the game

  • Published: 13 May 2024
  • Meta's Llama 3 is out. New model, new license, new opportunities.
    References:
    llama.meta.com/llama3/
    ai.meta.com/blog/meta-llama-3/
    github.com/meta-llama/llama3/...
    llama.meta.com/trust-and-safety/
    ai.meta.com/research/publicat...
    github.com/meta-llama/llama-r...
    llama.meta.com/llama3/license/
    about. news/2024/04/met...
    minchoi/status/17...
    _akhaliq/status/1...
    _philschmid/statu...
    lmsysorg/status/1...
    SebastienBubeck/s...
    _Mira___Mira_/sta...
    _philschmid/statu...
    cHHillee/status/1...
    www.meta.ai/?icebreaker=imagine
    OpenAI/status/177...
    OpenAIDevs/status...
    OpenAIDevs/status...
    CodeByPoonam/stat...
    hey_madni/status/...
    cloud.google.com/blog/product...
    altryne/status/17...
    xenovacom/status/...
    minchoi/status/17...
    www.udio.com/
    www.udio.com/pricing
    Links:
    Homepage: ykilcher.com
    Merch: ykilcher.com/merch
    YouTube: / yannickilcher
    Twitter: / ykilcher
    Discord: ykilcher.com/discord
    LinkedIn: / ykilcher
    If you want to support me, the best thing to do is to share out the content :)
    If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
    SubscribeStar: www.subscribestar.com/yannick...
    Patreon: / yannickilcher
    Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
    Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
    Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
    Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
  • Science

Comments • 150

  • @YuraCCC
    @YuraCCC 20 days ago +55

    I had my doubts about Zuck, but check him out now-championing open source AI like a boss! Maybe he should just grab the name 'Open AI'-that is, if nobody's snagged it yet

    • @demetriusmichael
      @demetriusmichael 20 days ago +7

      The training data will be a legal nightmare on these proprietary things. Making it open source is the only way in this case.

    • @antonystringfellow5152
      @antonystringfellow5152 19 days ago +11

      I know, this latest version of Zuck is amazing!
      I watched an interview of him talking about Llama 3 and he was so human-like

    • @peterfireflylund
      @peterfireflylund 19 days ago +2

      Opener AI?

    • @monad_tcp
      @monad_tcp 19 days ago +2

      @@antonystringfellow5152 yeah, his avatar got a massive upgrade, he's almost human now

    • @Wobbothe3rd
      @Wobbothe3rd 19 days ago +3

      He was like this in VR too. Throughout VR development Meta/Facebook published many things open source, including computer vision models.

  • @tantzer6113
    @tantzer6113 20 days ago +100

    “If you don’t know what I’m talking about - and I don’t know why you wouldn’t…” I don’t know it because you’re my main source for important developments in machine learning.

    • @tantzer6113
      @tantzer6113 20 days ago +10

      PS I don’t mind getting news with delay. I like it that you get into algorithms, capabilities, and technical overviews.

    • @mriz
      @mriz 19 days ago +6

      If you're on Twitter and follow a few folks in the LLM community, it's almost impossible to escape this hype and news on your timeline

  • @mikethedriver5673
    @mikethedriver5673 20 days ago +76

    Llama 3: they had no moat

    • @float32
      @float32 20 days ago +3

      This wouldn’t be such big news if they didn’t have a moat that is just now being bridged.

  • @GuagoFruit
    @GuagoFruit 20 days ago +16

    The next revolution imo definitely needs to be getting things to run locally with any sort of fidelity.

    • @Aphixx
      @Aphixx 19 days ago

      If Stable Diffusion is a good historical example, then we should see some pretty significant perf improvements as soon as people (nerds) decide to stubbornly mess with it until it works.

  • @olcaybuyan
    @olcaybuyan 20 days ago +26

    I am especially happy that Llama 3 supports multiple languages :-) Most open-access or open-source models are English only and are no real alternative to OpenAI's GPT.

  • @mikayahlevi
    @mikayahlevi 20 days ago +6

    The anti-open source AI safety person impression at 13:48 is too accurate🤣

  • @woolfel
    @woolfel 19 days ago +5

    There are numerous papers about data quality and data selection going back to 2000. Good to see people realize quantity is not the "end all" of training LLMs. Creating a good dataset has always been an art. Will the filters and pipeline for processing the data get open sourced?

  • @vladimirtchuiev2218
    @vladimirtchuiev2218 19 days ago +3

    Mixture of Depths is a promising direction in modularizing LLMs; you could basically use only part of the model for specific applications

  • @YuraCCC
    @YuraCCC 19 days ago +3

    Those t-shirt stripes are an example of reverse CAPTCHA - it spins humans right into dizziness and blackout, but AIs? They just keep watching and learning.

  • @thirdeye4654
    @thirdeye4654 19 days ago +3

    Udio is great in my opinion. You don't let the AI create whole songs, but segments (of around 33s). It usually creates 2 variants at the same time. You can then extend those segments (before or after), either by a mid-segment, an intro or an outro. You can even insert your own lyrics and it works like a charm. If you are happy with the song, you can then "publish" it and even pick a text-to-image cover art. I love that stuff.

  • @Hacking-Kitten
    @Hacking-Kitten 20 days ago +1

    Thank you very much for your videos! Could you point me to some of the techniques you find most promising for context length extension?

  • @propeacemindfortress
    @propeacemindfortress 19 days ago +2

    As always the best curated ML news.
    Love your expertise and humor :D
    oh, and... more fish for Yann LeCat!

  • @pietrorse
    @pietrorse 18 days ago +2

    If the training text were plain ASCII and the average token length 4 characters, the training dataset would have been ~55 terabytes of plain ASCII. Wow!
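
    A quick back-of-the-envelope check of that estimate, sketched in Python: the ~15T-token count is the figure Meta reported, while the 4-characters-per-token average and one byte per character (plain ASCII) are the commenter's own assumptions.

        tokens = 15e12          # ~15 trillion training tokens (Meta's reported figure)
        chars_per_token = 4     # assumed average token length
        bytes_per_char = 1      # plain ASCII
        total_bytes = tokens * chars_per_token * bytes_per_char
        print(f"{total_bytes / 1e12:.0f} TB (decimal)")   # ~60 TB
        print(f"{total_bytes / 2**40:.0f} TiB (binary)")  # ~55 TiB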

  • @Embassy_of_Jupiter
    @Embassy_of_Jupiter 16 days ago +2

    I think a valid reason not to build models on 95% English data is that it could significantly influence the world view and "zeitgeist" of the model in all languages. it makes sense to have fully local models to not homogenize the world even further with US thought.

  • @marsandbars
    @marsandbars 20 days ago +1

    20:40 An alternative to this is using a documented SDXL Turbo workflow with ComfyUI locally, which can produce images of decent fidelity at even faster speeds than this demo, at least on my 3090.

  • @Voljinable
    @Voljinable 14 days ago

    I really like the way you frame Meta/Zuck making Llama 3 open source. They choose the option that is best for the company, but what's best changes. For research and optimization an open-source model is better; for profit, a closed-source one is better.
    What they do just depends on what is best at the moment, but I like that it's open source for Llama 3 right now and hope it will stay that way!

  • @Nico79489
    @Nico79489 19 days ago +1

    Cool to have great open LLMs. Unfortunately, this is not the case for image generation models: all the recent advanced models, like SDXL or Photoshop's, are not free for commercial use.

  • @sebastianp4023
    @sebastianp4023 19 days ago +7

    we are getting to model sizes where they might as well just be compressed lookup tables

    • @andreicozma6026
      @andreicozma6026 17 days ago +1

      That's essentially what they are regardless. That's how attention works: tokens are used as queries against keys for computing similarity scores, and then the values are summed up based on those scores. It's essentially keying into a learned "dictionary" index.
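
      A minimal NumPy sketch of the query/key/value "lookup" described above; this is standard scaled dot-product attention for illustration only, not Llama's actual implementation.

          import numpy as np

          def attention(Q, K, V):
              # similarity of each query to every key, scaled by sqrt(d)
              scores = Q @ K.T / np.sqrt(K.shape[-1])
              # softmax turns the scores into lookup weights
              weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
              weights /= weights.sum(axis=-1, keepdims=True)
              # weighted sum of the values: a soft "dictionary" lookup
              return weights @ V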

  • @OperationDarkside
    @OperationDarkside 19 days ago

    Interesting times in many ways.

  • @yaxiongzhao6640
    @yaxiongzhao6640 20 days ago +12

    Right at the moment, Phi-3 changed the game again!...

    • @quickpert1382
      @quickpert1382 20 days ago

      Huh, yeah, the 7B standard is no longer the standard. It's a pretty good model, really, that can also be run on GPUs with 4 GB of VRAM.

  • @brandonheaton6197
    @brandonheaton6197 20 days ago

    You should recapitulate the math from the code synthesis project from MIT using Llama 3, because that would be lit

  • @yurykorolev
    @yurykorolev 19 days ago

    thank you

  • @auresdz701
    @auresdz701 19 days ago

    The scores are somehow high, and it makes me wonder whether they specially aligned the curated data and the validation data when doing instruction finetuning!!

  • @pablowentscobar
    @pablowentscobar 20 days ago +2

    Really enjoy these ML News vids. Great for keeping AI normies like me up to speed.

  • @huveja9799
    @huveja9799 20 days ago +7

    and we are witnessing a t-shirevolution .. I'm still dizzy from seeing those stripes ..

    • @monad_tcp
      @monad_tcp 19 days ago

      It's to confuse visual learning algorithms

    • @huveja9799
      @huveja9799 19 days ago

      @@monad_tcp
      Well, I hadn't thought of that, but now that you mention it, it may well be ..

  • @snarkyboojum
    @snarkyboojum 20 days ago +1

    🎉

  • @pawelkubik
    @pawelkubik 15 days ago

    I'm not sure why people have reservations about Phi specifically. We don't know what data were used to train the other models and to what extent their performance relies on "fitting to the test dataset". Did OpenAI ever admit what role the human-curated part of their training dataset plays in the model's performance?

  • @FilmFactry
    @FilmFactry 20 days ago

    If we had access to the GPT-4 weights and biases, how different would they be from the Llama 3 weights? I use all the LLMs and find them pretty much even. Claude is fast but limited. I find Gemini Pro a little dumber.

  • @JorgetePanete
    @JorgetePanete 19 days ago +2

    I don't even know what benchmarks to believe

  • @Iaotle
    @Iaotle 19 days ago +1

    This channel is one of the rare ones that I genuinely watch, amidst hours and hours of clickbait recycled AI hype videos :)

  • @cherubin7th
    @cherubin7th 19 days ago

    Very nice

  • @214F7Iic0ybZraC
    @214F7Iic0ybZraC 11 days ago +1

    6:23 "There is enough research to show that once you are capable at one language, you only need quite little data on another language to transfer that knowledge back and forth"
    Can anyone point me to papers related to this claim? I am interested in cross-lingual transfer in language models.

  • @Embassy_of_Jupiter
    @Embassy_of_Jupiter 16 days ago

    I find the license really fair. Their models will be obsolete by next year anyway. I think it is only appropriate that they should profit off of it until then, for having developed such a great step forward in local LLMs.

  • @thomasmuller7001
    @thomasmuller7001 19 days ago

    helli hello!

  • @sebastianp4023
    @sebastianp4023 19 days ago +3

    "we have used 15T tokens from publicly available sources . . . pls don't look too close . . ." 😂

    • @sebastianp4023
      @sebastianp4023 19 days ago +1

      that's a big "trust me bro"

    • @skierpage
      @skierpage 19 days ago

      The open-source dataset "The Pile" contained the 108 GB Books3 shadow library of approximately 196,640 pirated books, most of which are still under copyright. It is a "publicly available source," so the lying executives can shrug and screw over authors. They have $billions to spend on Nvidia chips but won't even buy e-books of the creative works they train on.
      (It's hard to tell what the current status is of The Pile and Bibliotik... intentionally so. The first Llama model trained on Books3. Rumor is all the AI companies saved a copy of Books3 before scripts pointing to the dataset were deleted, and Nvidia is being sued over training its NeMo LLM family.)

  • @theosalmon
    @theosalmon 20 days ago

    We appreciate this greatly, but go away and go back on vacation until you're rested!

  • @propeacemindfortress
    @propeacemindfortress 19 days ago +3

    I tried to use existing LLMs to prepare a fine-tuning dataset specific to Theravada thought, philosophy and practice. It turned out that all the models I tried were incapable of capturing any nuances in the meaning of words and concepts and stuck diligently to "their own philosophical framework of interpretation", regardless of the system prompt and regardless of feeding them scriptures, papers or video transcripts; they couldn't even identify the proper questions. So please don't mind me disagreeing: language alone, maybe even regardless of percentage distribution, doesn't cut it on any task that requires cultural, philosophical or religious understanding, not even talking about the human component in it. Translation, of course, is a totally different thing; common phrases and such can be captured quite well, the underlying unspoken human component not so much.

    • @klausschmidt982
      @klausschmidt982 19 days ago +1

      Yeah, that seems like a very hard task for such a model. Sometimes you have to properly manage your expectations with these things

    • @propeacemindfortress
      @propeacemindfortress 19 days ago +1

      @@klausschmidt982 I've given up on it; current models have neither the capability nor the training data that would allow for finer nuances on rare topics. Future models might be capable, but with the move to synthetic data... 🤷‍♂ it's very doubtful that future architectures can do it after the synthetic data has been flattened into a unified interpretation. Then we will have American and Chinese Buddhist interpretations 😂
      So I join in on that "Yeah"; it was a nice idea, but specialized things might need a lot more human work and training investment than I can afford.
      Have a good one, thanks for the reply.

  • @dr.mikeybee
    @dr.mikeybee 19 days ago

    Do I really need a better LLM than Llama3 70B? If I have a good agent with search, RAG, and memory, isn't that good enough?

  • @DanielWolf555
    @DanielWolf555 19 days ago

    Does Llama 3 have any vision capabilities like GPT4?

  • @pandalayreal
    @pandalayreal 19 days ago +1

    Old news, there is Phi-3 now.

  • @alan2here
    @alan2here 19 days ago +5

    Llama 3 BS generator v5
    don't worry, I'll include the Llama 3 at the start

  • @semaraugusto
    @semaraugusto 19 days ago

    The 8B param number strikes me as a bit weird. Why not 7B, to make a fair comparison between the models? Did they not achieve good results with 7B, or did they just not test it and decide in advance to compare against weaker models?

    • @BrandnyNikes
      @BrandnyNikes 19 days ago +2

      They have a larger vocabulary and, through that, more parameters in the embedding layers (a rough count is sketched after this thread). The other architecture (number of layers and heads) should still be the same.

    • @skierpage
      @skierpage 19 days ago

      Fat-fingered typing error? 🙂
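
      A rough count of the extra embedding parameters implied by the bigger vocabulary mentioned above, sketched in Python; it assumes the published Llama 3 8B config values (vocabulary ≈ 128K, hidden size 4096) versus a 32K vocabulary, with untied input and output embeddings, and is illustrative only.

          hidden = 4096            # embedding / model dimension
          vocab_old = 32_000       # Llama 2 tokenizer vocabulary
          vocab_new = 128_256      # Llama 3 tokenizer vocabulary
          extra = (vocab_new - vocab_old) * hidden * 2    # input embedding + output head
          print(f"~{extra / 1e9:.2f}B extra parameters")  # ~0.79B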

  • @OmicronChannel
    @OmicronChannel 19 days ago

    ScreenAI seems to be everything you need for an agent capable of performing UX interaction. That's exciting and disappointing at the same time (I agree that the accessibility via Google Vertex AI is very limited).
    Why can't Google provide a SIMPLE pay-as-you-go API-call solution like OpenAI and Anthropic??

  • @timothywcrane
    @timothywcrane 20 days ago

    Be sure to DL the Purple and Purple Free Versions by getting two emails for each model set requested. But be prepared for TB instead of GB worth of dls.

  • @logangarcia
    @logangarcia 18 days ago

    Pls add timestamps to the video

  • @mimszanadunstedt441
    @mimszanadunstedt441 17 days ago

    It's very easy to get models to hallucinate when asking for music recommendations. Llama is no different.

  • @BinarySplit
    @BinarySplit 20 days ago

    As a language learner, I find it not so great for other languages, and I question whether 5% non-English is enough. Discuss German and it'll often make mistakes explaining the grammar. Try to talk in Chinese and it'll switch back to English at every opportunity.
    Hopefully these are just issues with the prompt or instruction tuning that will be fixed by other fine-tunes, but for now I'm going back to Mixtral and ChatGPT...

  • @kbizzy111
    @kbizzy111 19 days ago

    More paper reviews please

  • @unvergebeneid
    @unvergebeneid 20 days ago +1

    I would be careful not to just laugh safety guys off as these silly modern-day luddites. Anyway, can't wait for llama3-uncensored:400B. But then again, I just want to do cool stuff and see the world burn, so don't mind me! 😊

  • @TiagoTiagoT
    @TiagoTiagoT 20 days ago

    What if they keep overtraining the smaller models until they plateau?

    • @JurekOK
      @JurekOK 20 days ago +1

      that's literally what they did with Llama 3

    • @TiagoTiagoT
      @TiagoTiagoT 19 days ago

      @@JurekOK I thought they said they hadn't plateaued yet by the time they stopped training?

  • @skierpage
    @skierpage 19 days ago

    @13:16 "and with the past with Llama 2 we've already seen that all these people who have announced how terrible the world is going to be if we open-source these models have been wrong -- have been plainly wrong. The improvement in the field, the good things that have happened undoubtedly, massively, outweigh any sort of bad things that happen, and I don't think there's a big question about that. It's just that the same people now say 'Well okay not this model, but the next model... is really dangerous to release openly.' So this is the next model, and my prediction today is it's going to be just fine, in fact it's going to be amazing releasing this."
    @Yannic, That's quite a set of claims. What are all "the good things that have happened" beyond technical advances like more efficient models? I'm sure millions of people are more productive and writing better (or at least spewing grammatically correct verbiage), but are there actual studies of the good things, both with AI in general and open-source models? Meanwhile it's unclear how long it will be before we discover the awful uses of AI in the 2024 election cycles in major countries and other disinformation campaigns.
    I'm willing to believe your take, but some evidence for your optimism would be nice.

  • @andytroo
    @andytroo 20 days ago

    I think there is no need to make it private: the moment a model requires more than ~24 GB of RAM to run, it is out of the hands of most businesses to use directly. You can release the weights, and you can privately run the poor models quickly and the medium models slowly, or you can laugh as your hardware runs out of RAM trying to run the full Facebook model...
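
    A rough sketch of the memory arithmetic behind that point, counting weights only (activations and the KV cache add more); the parameter counts and quantization levels shown are illustrative.

        def weight_gb(params_billion, bits_per_weight):
            # approximate memory needed just to hold the weights
            return params_billion * 1e9 * bits_per_weight / 8 / 1e9

        for params in (8, 70):
            for bits in (16, 8, 4):
                print(f"{params}B @ {bits}-bit: ~{weight_gb(params, bits):.0f} GB")
        # 8B fits a 24 GB card even at 16-bit; 70B still needs ~35 GB at 4-bit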

  • @zyxwvutsrqponmlkh
    @zyxwvutsrqponmlkh 20 days ago +1

    Llama can pretend to run code; I got it to simulate a DOS prompt and play text adventure games.

  • @diga4696
    @diga4696 20 days ago +3

    Wow it's only been two days since llama 3 release!? I swear it felt like a month ago..

  • @timothywcrane
    @timothywcrane 20 days ago

    If you are the first to produce the MMLU, is that an achievement or shameful? Luv that OpenAI just added reverse "gas fees".

  • @Dogo.R
    @Dogo.R 20 days ago +1

    Remember, math results not utilizing Wolfram are meaningless, since those results will be child's play compared to results using Wolfram as a tool.

    • @eadweard.
      @eadweard. 20 days ago +2

      Cannot tell what you are trying to say.

  • @mauricioalfaro9406
    @mauricioalfaro9406 20 days ago

    0:05 The usual little cynical chuckle

  • @SimonJackson13
    @SimonJackson13 19 days ago

    WinAmp ....

  • @henrischomacker6097
    @henrischomacker6097 20 days ago

    I really hoped that the small model would be better in German, but unfortunately it is not good enough that I would prefer to talk to it only in German, and I don't think that my friends who speak only German would like to talk to it.
    The bigger model is probably much better in foreign languages, but unfortunately that one is again too big for a 4090. It's a pity.
    So having our own app at home, available via VPN from the mobile phone so our other non-English-speaking friends can also use it, is still not really an option. Normal people are ignorant and would laugh at me. Maybe not if I gave a female assistant an erotic French voice? ;-)
    But I must say that despite that I really like the instruct model; the chat model, though, gave me a lot of BS.
    Maybe some parameter tweaking will change that. I haven't had the time to play around with it more right now. But we'll see...

  • @JacobAsmuth-jw8uc
    @JacobAsmuth-jw8uc 20 days ago

    Immediately passed by Phi 3

  • @timothywcrane
    @timothywcrane 20 days ago

    If it weren't for open weights, crazies like myself banging away on 1050 Tis and Pis would never have been "allowed".

  • @user-rk4ux3cj8q
    @user-rk4ux3cj8q 15 days ago

    8:25

  • @christopherknight5526
    @christopherknight5526 20 days ago +2

    Yikes! Missed the phi-3 announcement..

    • @tomaszkarwik6357
      @tomaszkarwik6357 20 days ago +1

      The second half of the video is about phi-3

  • @alan2here
    @alan2here 20 days ago

    Yeah you can get it in Africa, US, Asia, but not in the UK

  • @hypno5690
    @hypno5690 20 days ago +3

    I can't care about LLMs until we get personal assistants that are completely customizable and fully transparent with no censorship.

    • @lonelybookworm
      @lonelybookworm 20 days ago +7

      But you can? It just requires a beefy PC

    • @Rhannmah
      @Rhannmah 19 days ago

      Well you should care, because large language models and their evolutions are about to take over your life.

  • @Embassy_of_Jupiter
    @Embassy_of_Jupiter 16 days ago

    I tried a 2 bit quantized 70B model and it blew my mind how good it still was

  • @TiagoTiagoT
    @TiagoTiagoT 20 days ago +2

    13:23 To be fair, you can only end the world once, and after that happens you (luckily) won't be around to witness the outcome. Black swan with a touch of the anthropic (no pun intended) principle: you can only be alive to witness the state of the world if the world you live in has not ended yet; once that happens, you likely will not be in a condition to acknowledge that it has happened. It is not something you can look back on and see after the fact; you can only experience it the first time it happens, and that is if you are able to have any experience at all while it is happening.

    • @Hexanitrobenzene
      @Hexanitrobenzene 20 days ago +1

      Yannic does not believe that AI can cause existential risk. With this generation of models, he is probably right, but the trend is not promising...

    • @TiagoTiagoT
      @TiagoTiagoT 19 days ago +1

      @@Hexanitrobenzene Humanity is blindly approaching "tickling the dragon's tail" territory; but unlike with the Demon Core, once it goes critical, it won't be just a matter of a few lab workers suffering radiation exposure.
      Who knows, maybe we'll luck out and go the comic book route and gain godly super-powers; but in the real world, the odds aren't looking good.
      Don't get me wrong, I'm not saying we would be safer with just the big corps handling the development of the future, or the end of the future, of humanity; we're fucked either way, Moloch, you know?

    • @Rhannmah
      @Rhannmah 19 days ago +1

      @@TiagoTiagoT Relax. A language model has neither the agency nor the tools to take actions in the real world, and even if it did, it wouldn't be able to react to and incorporate the results.
      We're quite far from the situation you're thinking of. That doesn't mean you don't have to think about it, because it's pretty much undoubtedly coming in the future, but there is nothing to freak out about. The only actual worry in the immediate future is the number of people who become unemployable because of the performance of generative models.

    • @TiagoTiagoT
      @TiagoTiagoT 19 days ago

      @@Rhannmah You must not have been following the news closely in recent years; people have been giving them all those abilities bit by bit, at a faster and faster rate.
      Unemployment is a concern, but that's just the bathtub starting to overflow; meanwhile the air faintly smells like gas and there are lit candles all over the place...

    • @skierpage
      @skierpage 19 days ago

      @@Rhannmah even without agency for the unknowable goals of an ASI, current AIs allow bad actors to do bad things with minimal effort. And what's really concerning is Google/Meta/Microsoft/OpenAI are run and owned by billionaire sociopaths whose goals include: getting you hooked on a stream of content so they can know all about you to monetize your profile; avoiding any meaningful government regulation; and stopping the redistribution of their wealth. Now imagine even worse actors and political campaigns having similar capabilities.

  • @syncrossus
    @syncrossus 19 days ago

    > The good that's come from these models far outweighs the bad
    Really? Don't get me wrong, I think language models are great but I know people have lost their jobs over this, we've seen data breaches, people are falling in love with AI personas, one guy was driven to suicide, scams are on the rise... I have no shortage of bad things to mention that have come out of AI, but I can't think of anything truly good. I mean I'm sure a good number of people are a bit more productive in their work, but that doesn't seem like a worthy tradeoff to me.
    I also disagree with your cavalier attitude towards safety based on past experience. It seems possible to me that as these models become more powerful, we may attain the AI singularity (ability for self-improvement). Once that happens, past experience will have very little wisdom to impart on us regarding what will happen next. It's very possible that we're worried for nothing, but given the scale of what's at stake, it only makes sense to be cautious.

  • @seventyfive7597
    @seventyfive7597 20 days ago +6

    Correction: this is not open source. Open weights without the release of processes or code is akin to a binary library: you can build with it, but you depend on it without knowledge. Open source should be at minimum as open as Grok 1.0; otherwise it is quite an evil way of sourcing, getting regulatory ease from the government and dev ideas from the community while keeping them dependent on the "binary". The same goes for Mistral.

    • @eadweard.
      @eadweard. 20 days ago

      Not sure what you mean. If they release no architectural code/information, how can you use it at all, even for inference?

    • @Hexanitrobenzene
      @Hexanitrobenzene 20 days ago +3

      It's "open weights". Classical code does not have an analogy. With open weights you can do quite a lot of customizations, in contrast to binary library, which would require an extremely difficult task of reverse engineering to do any modification.
      True open source would be revealing the training data, the model code and the details of training processes.

    • @seventyfive7597
      @seventyfive7597 19 days ago

      @@Hexanitrobenzene I have to disagree with your first paragraph: what you're referring to is akin to the include file that goes along with the binary. It's closed code. For comparison, look at the amount of information xAI released along with Grok 1.0. Mistral and Meta are local closed code, while OpenAI is SaaS closed code, but both are closed.

    • @clray123
      @clray123 19 days ago

      It is much worse with Llama because they reserve the right to terminate your license by accusing you of violating the Acceptable Use Policy, which they can change at basically any time. They also force you to defend them in court (indemnification) if your users sue them, which could be a big deal for a small company.

    • @Hexanitrobenzene
      @Hexanitrobenzene 17 days ago

      @@seventyfive7597
      By "modifications" I meant fine tuning, which can be done way cheaper (

  • @TheEarlVix
    @TheEarlVix 20 days ago +2

    Spun the 8B parameter Llama 3 model up locally with Ollama, asked it to summarise some text, and it just spat out garbage. Tried it on Q4, Q8 and FP16 quantizations, and apart from "Why is the sky blue?" everything else I tried got a totally rubbish response (a reproduction sketch follows after this thread). Also found that it often went into long, seemingly endless cycles of outputting the same paragraph of nonsense over and over again.
    Can't speak for the 70B parameter model, but the results with the 8B show that this smaller version is definitely not suitable for prime time.

    • @whoareyouqqq
      @whoareyouqqq 20 days ago +1

      ++++++ same result

    • @whoareyouqqq
      @whoareyouqqq 20 days ago +1

      Phi3 significantly better

    • @TheEarlVix
      @TheEarlVix 20 days ago

      @@whoareyouqqq Yes, I tried Phi-3 for a sanity check because it all seemed a bit odd, especially after all the Llama 3 release hype, and Phi worked fine: not perfect, but definitely without the complete-garbage issues.
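
    A minimal sketch of reproducing that summarisation test against a locally running Ollama server through its REST API; the model tag ("llama3:8b") and the prompt text are assumptions to adjust for whichever quantization was actually pulled.

        import requests

        MODEL = "llama3:8b"  # assumed tag; swap in the q4/q8/fp16 variant you pulled

        resp = requests.post(
            "http://localhost:11434/api/generate",   # Ollama's default local endpoint
            json={
                "model": MODEL,
                "prompt": "Summarise the following text in three sentences:\n\n<text here>",
                "stream": False,                     # return one JSON object instead of a stream
            },
            timeout=600,
        )
        print(resp.json()["response"])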

  • @TylerMatthewHarris
    @TylerMatthewHarris 20 days ago

    Onest

  • @jermunitz3020
    @jermunitz3020 20 days ago +2

    🦙

  • @halocemagnum8351
    @halocemagnum8351 19 days ago

    Obsessively blaming the safety crowd is, IMO, kinda cringe and lame. It's obvious why OpenAI and Anthropic don't open source their models: it's for profitability reasons. They don't even pretend it isn't, and they don't use safety as an excuse. Constantly blaming people who care about safety is gonna lead to a rude awakening when Facebook realizes it has tanked enough competitor market share and announces its own fully closed-off, monetized models.

  • @definty
    @definty 19 days ago

    Phi-3 is out and already beats the Llama 3 7B model, and it's only like a week after the Llama 3 release.

  • @XOPOIIIO
    @XOPOIIIO 20 days ago +4

    People who thought that nuclear proliferation would cause nuclear war were wrong; they were wrong all along.

    • @eadweard.
      @eadweard. 20 days ago +1

      Well, they weren't wrong that it was extremely risky. It just didn't happen to happen.

    • @XOPOIIIO
      @XOPOIIIO 20 days ago

      @@eadweard. Exactly

    • @eadweard.
      @eadweard. 20 days ago +2

      @@XOPOIIIO Cannot tell what you are trying to say.

    • @XOPOIIIO
      @XOPOIIIO 20 days ago

      @@eadweard. How old are you? What is your IQ? Did you watch the video?

    • @oncedidactic
      @oncedidactic 19 days ago

      Is nuclear war possible without the proliferation of nuclear weapons? If you're going to talk about logical causes, be specific about the claim.

  • @dr.mikeybee
    @dr.mikeybee 19 days ago

    Regarding Llama3, Sam looked scared out of his mind in a recent video. ClosedAI sucks.

  • @ulamss5
    @ulamss5 20 days ago +4

    Just a reminder that OpenAI constantly nerfs their production models. Beating today's GPT-3.5 doesn't mean it beats the launch GPT-3.5 that formed our first impressions.

    • @Phobos11
      @Phobos11 18 days ago

      The launch version doesn't exist anymore and will never exist again, so I'm not really sure there's a point in comparing to a ghost. As long as open-source models keep getting better, it's progress :D

  • @Effectivebasketball
    @Effectivebasketball 18 days ago

    No, it is not. Just another player and not the best of all.

  • @haldanesghost
    @haldanesghost 19 days ago

    This has really changed my perspective (from pessimistic to a little more optimistic), both because of the dunking on the doomers and because, by releasing these models and being unapologetic about it, we can start to get rid of the mystique that has been given to them by this Wizard of Oz game OpenAI was playing: letting people learn to deal with these systems by themselves and see what's under the hood. I'm confident that's going to lead to more efficient use of these systems, something that's not achievable when the name of the game is just "MAKE MODEL BIGGA! MOAR DATA! MOAR COMPUTE!!!" The power of having generalized approximators is wasted if all you use them for is effectively brute force on a graph.
    The thing about data quality cannot be overstated. If we can be rational adults for a second and drop the hype, the fact of the matter is that calling these systems "artificial intelligence" and acting as if they're machine god doesn't change the fact that they're not intelligent, not doing anything close to it, and do not have any of the cognitive properties the hypers and the doomers keep attributing to them. They are just functions, literal f(x)'s (granted, big spicy ones). You're fitting a function to data under some optimization procedure.
    The relevance of the data is that in neural networks (and siblings) we have mathematical guarantees about them being able to fit anything (within reason); they're general-purpose approximators. That's a super useful thing to have! Quite powerful. You know what the weakness is, though? You can fit *anything*. Anything includes things you as a human don't want! But if the thing you don't want generated a signal that can be used to minimize loss, then the system doing things you don't want is actually working as intended.
    Being able to fit anything means that the function you're using to fit ceases to be of central importance, completely shifting the burden onto the data itself. Fitting these models (assuming you pulled it off) is just moving the data distribution from an explicit data format to a functional representation.
    Hopefully this leads to a sobering of the field, and maybe an attempt is made to return to symbolics with the gains of these models, and maybe, just maybe, an artificial system could not just sound like a human, but reason like one.

    • @skierpage
      @skierpage 19 days ago

      The only way these LLMs can successfully predict the next word in an answer or conversation on almost any subject expressible in a sequence of characters (!!) is by being intelligent and having cognitive properties like understanding. We are all SO BLOODY TIRED of people claiming otherwise; if you deny AI "intelligence" and "understanding," you are making up your own definitions so you can move the goalposts to another sports stadium altogether. Just say that AI intelligence, understanding, and cognitive properties are not the same as human intelligence and human understanding (yes, we know), and give us your take on how they fall short.
      (I tried to use Bing Copilot Chat to find the pithy tweet where an AI expert trashed your tired wisdom that these things aren't intelligent... and it couldn't find it.)

  • @markburton5318
    @markburton5318 20 days ago +1

    I don’t think you can say there is no harm from open source AI. It is too early. It was inevitable anyway that there would be leaks. But people will try to cause harm. Kids in the US machine gun schools in order to be famous, so of course someone will try to create ‘terminator’. You laugh about EU legislation but at least certain activities have to be illegal otherwise they cannot be stopped. The effort on safety has to be stepped up.

  • @ivanstepanovftw
    @ivanstepanovftw 20 days ago +2

    Can you please put a summary at the beginning or end of your videos? It is so boring to listen to "wow it is so good!" or "best model", etc.

    • @mikethedriver5673
      @mikethedriver5673 20 days ago +6

      Adding this may improve your videos possibly, but I disagree that it is boring. I very much enjoy your videos 😊

    • @ivanstepanovftw
      @ivanstepanovftw 19 days ago

      @@mikethedriver5673 OK! Here is a spoiler for LLaMA 4 / Mistral 2 7B / Phi / etc.: OH MY GOD, IT IS SO MUCH BETTER, IT BEATS GPT-3.5.

  • @RozenKrieg
    @RozenKrieg 19 days ago

    Llama 3 is a real downgrade

  • @tunestar
    @tunestar 20 days ago

    Kinda late video.

  • @buttpub
    @buttpub 20 days ago

    Well, Llama 3 compared to Mistral does not really perform much better, the 7B and 8B that is.