Claude 3 Just Released - “Outperforms GPT-4 And Gemini in Every Category!”

  • Published: Mar 3, 2024
  • Anthropic has just released Claude 3, a brand-new version of Claude and a pretty big upgrade from Claude 2.
    Claude 3 beats GPT-4 and Gemini in the top benchmark tests.
    Claude 3 comes in three different models: Haiku, Sonnet, and Opus.
    All models of Claude 3 have vision capabilities.
    The best model, Opus, requires a paid Claude Pro subscription.
    In this video, I'll test its Vision capabilities, writing ability, image-to-code capabilities, and coding capabilities.
    You can read the full blog post here: www.anthropic.com/news/claude...

Comments • 44

  • @JOHN.Z999 4 months ago +4

    I believe that the launch of GPT-5 will take place next week, but it would be amazing if it happened this week. That way, in addition to celebrating the one-year anniversary of GPT-4, we would have the chance to constantly talk about GPT-5. I hope that GPT-5 will exhibit reasoning far superior to all currently available models. With this, OpenAI would quickly silence critics and envious voices.

  • @EDashMan 4 months ago +1

    Love your benchmark and comparison tests: simple, effective, and not too long. I've seen a bunch of AI vids about the Claude model released around the same time, but as soon as I saw yours I had to click first. You reckon you could do more coding examples?

  • @tomgreen8246 4 months ago

    Been playing with it at work today... it's exceptional. Surprised and impressed. Wish it had web browsing though.

  • @dhruvgupta4170 3 months ago +3

    Claude is not available in Canada.

    • @whellockroad 3 months ago

      Wonder how these geographic decisions are made. It's available in my two homes: Thailand AND Sri Lanka....hmmmm.

  • @mohamedyasser840 4 months ago +2

    good job man

  • @jd_real1 4 months ago +2

    I'm impressed with it. I asked Claude how to fix my car and the response matched GPT-4's, and they were both right. I also uploaded a picture of a mole and asked if it was skin cancer. Claude said that it didn't display markings of cancer but that I need to ask my doctor. GPT-4 straight up told me nothing, said it violated its TOS, and said to only go to the doctor. I also went to the doctor and he said it wasn't cancer. I'll probably switch.

  • @micbab-vg2mu 4 months ago +1

    I am surprised how good it is:)

  • @GamerEngineer1345 4 months ago +7

    Can't wait for Perplexity to add Claude 3 to the group of models that can be used in Copilot mode. It's gonna be epic.

    • @totempow 4 months ago

      It's in Poe as of now. Files are a little odd though. Not taking pictures. Wah wahhhh.

    • @timooothy1234 4 months ago

      @totempow As far as I know, it accepts .docx (Microsoft Word document) files.

    • @JaddOnTheTrackakaJOTT 4 months ago

      What if I told you they already did on their web browser?

    • @totempow 4 months ago

      I'd be a little happier. @JaddOnTheTrackakaJOTT

    • @GamerEngineer1345 4 months ago

      @JaddOnTheTrackakaJOTT Just saw it, but it's limited to 5 queries per day.

  • @oryanol 4 months ago

    Good video as usual. Thanks for the details

  • @FamousTVvoice 3 months ago

    Referring to 8:00: This might be very subjective, but historically the use of a postscript dates back to when correspondence was handwritten or typed, making it cumbersome to incorporate any afterthoughts or additional information into the body of the letter without rewriting the entire message. I get it, today it's used to stress or highlight a point, but *better and more effective writing* would negate that. Just a thought...

  • @chengalvalavenkata2401 3 months ago

    If you can upload a report with all your test runs (Claude vs GPT-4) that would be great. :)

  • @seventyfive7597 4 months ago +1

    So you tested whether the Claude team fine-tuned their model on the snake question, and it's nice that the people there are aware of repeated tests, but how about really testing it on code?

    • @SkillLeapAI 4 months ago

      I’m not a developer, but if you have a recommendation I can test out, I’m happy to try.

    • @seventyfive7597 3 months ago

      @SkillLeapAI Just ask it to perform any task you'd like. For example, ask it to create a different game of similar complexity to Snake. As long as the question has not been asked in the past, you're good to go.

  • @sfinford 4 months ago +1

    YOOOO

  • @Futurist_05 3 months ago

    Who is actually testing each version of these AI models when they release it to the public? I mean the comparison table. Is there any regulation?

    • @SkillLeapAI 3 months ago

      As far as I know, those are internal benchmark tests they run.

  • @timooothy1234 4 months ago

    I'll let this marinate for some weeks or months so it can be better trained on user input.

  • @thaholylemon43 3 months ago

    I am sticking with ChatGPT; as soon as GPT-5 comes out there will be no competition.

  • @qu_entin 4 months ago

    At least Gemini Advanced gave me a 2-month free trial (and I am mind-blown compared to GPT-4 and will switch if OpenAI is not able to adapt). I asked Claude (free) a question and the response was something like "I'm too busy, please try the Pro version". Thank you, but this is not the way to generate new customers.

  • @tuvichuanhangngay 4 months ago

    GPT-4 also has issues and can freeze, but not as badly as Claude's servers, which take about 10-15 minutes to answer one question. One wishes they had servers like Google's; Gemini is always fast. The video may have overstated Claude 3's capabilities with strong claims of superiority over competitors in every area. Anthropic's model may show strengths in some areas, but large language models are very complex and total dominance is rare. A more balanced presentation would focus on the specific strengths Claude 3 has, compare them with its weaknesses, and acknowledge that performance can vary by task. It is important to wait for independent verification of these claims, since companies can be biased toward their own products, which casts doubt on sweeping claims.

  • @RobloxInsanity 4 months ago

    Might keep my subscription if Claude is even better now. I actually use it mainly for helping me write my books, for game coding, and for a few other very small things.
    When it came to my book writing, ChatGPT sometimes did it better, like helping me expand a paragraph of story text by adding more detail to what I had already typed.

  • @konrad3 4 months ago +2

    Meanwhile in the European Union Claude is still not available...
    And you'll need a phone number to verify your country.

    • @qu_entin 4 months ago

      It works with a US VPN, and they sent a code to my EU number; the verification worked. But I do not know whether you are able to purchase the Pro plan eventually. I did not try.

  • @phen-themoogle7651 4 months ago +1

    Claude 3 is awesome but the servers are 💀... now that everyone is there lol
    And GPT-4 was also having issues and would freeze a lot, but not as crazy as Claude's servers, which take 10-15 mins for one reply now. I wish they had Google's servers; Gemini is always ultra fast.

  • @TheHistoryCode125 4 months ago +3

    The video likely overhypes Claude 3's capabilities with its bold claim of outperforming competitors in every category. While Anthropic's model may show strengths in certain areas, large language models (LLMs) are complex, and outright dominance is rare. A more balanced presentation would highlight specific benchmarks where Claude 3 excels, compare its weaknesses, and acknowledge that performance can vary depending on the task. Additionally, it's important to await independent verification of these claims, as companies can be biased towards their own products, making skepticism towards sweeping statements advisable.

    • @SkillLeapAI 4 months ago +5

      Well, looks like ChatGPT is bad at commenting on YouTube videos. Not at all what the video is.

  • @adhumon55 4 months ago +2

    Not impressed. Claude 3.0 models sound more like GPT rather than sounding human-like as 2.1 and 2.0 did! Very sad that they destroyed the strength of Claude.

  • @alejandrones5238 4 months ago

    First ❤

  • @Apokalupsis88 4 months ago

    Except it's not true. In an actual head-to-head vs GPT-4, it was shown to be a bit inferior: ruclips.net/video/sX8Ri3w2MeM/видео.html&ab

    • @SkillLeapAI 4 months ago

      Well, everyone has had it for like 4 hours, so we really can't make a real determination.

    • @SkillLeapAI 4 months ago

      Also, Matt's video has the same title, which is Claude's claim, and after watching it, it doesn't sound like he came to a conclusive answer either.

    • @Apokalupsis88 4 months ago

      @SkillLeapAI Right, but the claim is in quotation marks, indicating that it's just the claim and not necessarily reality. Matt's conclusion is that Claude didn't beat out GPT-4 and is more expensive. He does point out that GPT won out in logic and dialog use but Claude did very well in the technical portion (centipede game).

    • @SkillLeapAI 4 months ago +1

      I see. For some reason all his titles say shocking or breaking lately and I can’t keep track. On the consumer side, they are both $20 a month, and I usually compare the consumer-facing chatbot and not the API. But I understand the point. I just don’t think any of us can make any claim of our own with a couple of hours of testing. I do remember Gemini had similar claims and I ended up disagreeing with every benchmark. So we will see.

    • @SkillLeapAI 4 months ago

      I added quotes too so it’s clear it’s their claim and not mine.