Can ChatGPT solve the world's hardest puzzles?

  • Published: 21 Nov 2024

Comments • 84

  • @sanjey-ww8jn
    @sanjey-ww8jn 1 year ago +201

    This guy talks to ChatGPT exactly the same way interviewers do during my technical interviews xd

  • @nurichbinreel4782
    @nurichbinreel4782 1 year ago +53

    You didn't even have to go this hard. Ask GPT-4 to solve a simple Caesar cipher. You can even tell it the exact letter shift and it will still fail to apply it.

    • @EvanG529
      @EvanG529 9 months ago +3

      ChatGPT doesn't do well with the actual details of a string of text. It can't really store information you give it, it just knows what word probably comes next.
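
For reference, applying a known Caesar shift is a purely mechanical step. A minimal sketch, assuming plain ASCII letters and a hypothetical helper name `caesar_shift` (not something from the video):

```python
import string


def caesar_shift(text: str, shift: int) -> str:
    """Shift each ASCII letter by `shift` positions, wrapping around the alphabet."""
    result = []
    for ch in text:
        if ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            # Digits, spaces and punctuation are left unchanged.
            result.append(ch)
    return "".join(result)


encoded = caesar_shift("Hello, World!", 3)
decoded = caesar_shift(encoded, -3)  # a negative shift undoes the encoding
print(encoded, "->", decoded)
```

Decoding is the same function with the shift negated, which is why knowing the exact shift makes the task trivial for ordinary code.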

  • @Viniter
    @Viniter 10 months ago +6

    It produced text that looks convincingly like an answer to a logic puzzle, which is exactly what it's trained to do, so 10/10.

  • @makesnosense6304
    @makesnosense6304 1 year ago +18

    What's funny about all these "Coded an entire website using ChatGPT" is that 1. It's not really an entire website. It's just basic stuff. Unless you count some one page site with some simpler functionality and buttons an "entire website"... And 2. There were plenty of corrections before ending up with whatever was made.

  • @IHaventDiedYet
    @IHaventDiedYet 1 year ago +71

    Thought this was like a 50k subs channel, only 148? Greatly underrated

  • @FAB1150
    @FAB1150 1 year ago +27

    4:09 FYI it didn't bug out, it ran out of tokens for the answer. You can tell it "continue" or "go on", and it will go on with the answer!

    • @xBINARYGODx
      @xBINARYGODx 1 year ago +2

      yes, this and other things make me think he doesn't understand language models and their limitations too.

  • @sandoh9500
    @sandoh9500 1 year ago +39

    This channel is destined for big stuff

    • @Jake28
      @Jake28 1 year ago

      you're destined for big stuff

  • @colouredmirrorball
    @colouredmirrorball 1 year ago +1

    I was trying to concentrate on the puzzles, but I kept getting distracted by THE LICC

  • @Xeverous
    @Xeverous 1 year ago +6

    ChatGPT can't solve anything because it doesn't understand the meaning of words. All it does is pattern matching and probability models. The answers to simple puzzles come out probably just because the training input already had them and the bot correlated these answers with the questions in the input.

  • @mcwolfbeast
    @mcwolfbeast 1 year ago +17

    ChatGPT is a language model based thing. Don't expect it to understand problems that fall outside of the scope of basic logic and language comparison.

    • @orterves
      @orterves 1 year ago +7

      This absolutely. I wonder though if a similar model trained purely on mathematics would have better success with maths problems?

    • @SgtSupaman
      @SgtSupaman 1 year ago +6

      So, it failing to provide a word with 11 letters has nothing to do with language?

  • @kipchickensout
    @kipchickensout 1 year ago +6

    It's only good for stuff that does not require too much thinking or calculation. If I told it to give me certain logic in code, it was only able to do it for very common things such as a Levenshtein distance algorithm, but not for something lesser known.

    • @orterves
      @orterves 1 year ago

      That's because it doesn't do any thinking or calculations. It's a word predictor. It predicts words.
      Don't hammer your nails with a sponge. Don't run mathematical calculations with a word predictor.
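
For context on why Levenshtein distance counts as a "very common thing": the textbook dynamic-programming version fits in a few lines and appears all over the web. A minimal sketch of that standard algorithm (the function name `levenshtein` is just an illustrative choice):

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn `a` into `b`."""
    # previous[j] holds the distance between the processed prefix of `a` and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + 1,               # delete ca
                    current[j - 1] + 1,            # insert cb
                    previous[j - 1] + (ca != cb),  # substitute, or keep if equal
                )
            )
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))
```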

  • @orterves
    @orterves 1 year ago +13

    ChatGPT is a word predictor with a bias towards attempting to match the current conversation context.
    The more you try to correct it in a conversation, the more tied up in the context it gets.
    Don't correct the bot with further conversation, edit your statements or restart the conversation entirely

    • @charleystello1822
      @charleystello1822 1 year ago

      Yes! While these puzzles are really difficult and I'm not so sure it would have gotten them anyway, the way he was talking to GPT was not exactly "correct". As you said, it relies heavily on context. While it is possible to correct simple mistakes, on difficult tasks correcting it does more harm than good, because it starts to get confused about what is fact and what is not, based on both what it said previously and what the user has inputted. The majority of the hallucinations that I have witnessed with GPT come from trying to correct it without doing it in the right way, if that makes sense.

  • @herzogsbuick
    @herzogsbuick 1 year ago

    that LifeAdviceLamp "Buy Lottery Tickets" tweet, is the king of the city of my heart

  • @KidJV
    @KidJV 1 year ago +1

    underrated channel is underrated

  • @GeoRoze
    @GeoRoze 1 year ago +5

    If anyone wants some peak humour:
    Ask chatgpt to draw an ascii art of yoshi.
    You’ll be surprised at what you find

  • @perelmanych
    @perelmanych 1 year ago +12

    I asked whether ChatGPT knows the Bulls and Cows game and suggested we play it. The bot thought of a number and I had to guess it. After the third answer, the limitations of a bot that just tries "to continue a sequence of words with the most probable candidate" became very obvious)) Its answers were inconsistent, and when I pointed out the inconsistency it agreed it had made a mistake, but the new answer it gave was as inconsistent as the previous one. To sum up, when ChatGPT has seen something similar to a problem in its training set, as I believe was the case for the single-cross problem, it can produce wonderful results, but do not expect real reasoning from it.
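
The kind of inconsistency described here can be checked mechanically. A rough sketch, assuming the common variant with a 4-digit secret made of distinct digits; `score` and `consistent_secrets` are hypothetical helper names:

```python
from itertools import permutations


def score(secret: str, guess: str) -> tuple[int, int]:
    """Return (bulls, cows): right digit in the right place, right digit in the wrong place."""
    bulls = sum(s == g for s, g in zip(secret, guess))
    common = sum(min(secret.count(d), guess.count(d)) for d in set(guess))
    return bulls, common - bulls


def consistent_secrets(history, digits="0123456789", length=4):
    """List every secret that agrees with all (guess, (bulls, cows)) pairs so far."""
    candidates = ("".join(p) for p in permutations(digits, length))
    return [c for c in candidates
            if all(score(c, guess) == feedback for guess, feedback in history)]


# Feedback of 3 bulls and 1 cow can never occur with distinct digits,
# so no secret is consistent with it and the answerer must have slipped.
print(consistent_secrets([("1234", (3, 1))]))
```

If the list of consistent secrets ever becomes empty, the answers given so far contradict each other, which matches what the comment reports.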

  • @xrayian
    @xrayian 1 year ago +1

    Kinda feeling happy for being a subscriber before 1k, you'll do great if you keep at it!

  • @asdfssdfghgdfy5940
    @asdfssdfghgdfy5940 1 year ago +1

    Man I would die on the hill about seed being an early stage of a plant.

  • @IcecubeX
    @IcecubeX 1 year ago +7

    this is really cool

    • @kevinfaang
      @kevinfaang 1 year ago +7

      you're pretty cool yourself, IcecubeX 2000

  • @lancemarchetti8673
    @lancemarchetti8673 1 year ago +2

    I created a simple steganography challenge for ChatGPT, which only required 5 steps to uncover my pseudo Google account details. The bot could not solve it. I used no password encryption, only standard ASCII reversal, encoded to binary, then to Base64. I then advanced every 3rd character in the Base64 by 1. I then added the resulting string into metadata in a standard JPEG file depicting a red rose. It would have been cool if the AI could have uncovered the hidden data. Perhaps we still have far to go before AI can achieve this?

    • @samuelthecamel
      @samuelthecamel 1 year ago +1

      ChatGPT is really just a fancy next-word predictor, so it can't really do stuff like that and probably never will for a long time. It's like trying to use a fork to eat soup. That being said, if there's an AI that is specially trained for this task, it may be able to recover your account details.

    • @lancemarchetti8673
      @lancemarchetti8673 1 year ago

      @@samuelthecamel Agreed
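
The encoding steps in this thread are only loosely specified, so the following is one possible reading rather than the commenter's actual script. It assumes the text is reversed, written out as a string of '0'/'1' characters, Base64-encoded, and then every third character (positions 3, 6, 9, ...) has its code point advanced by 1. The JPEG metadata step is left out, and `encode` and `decode` are hypothetical names:

```python
import base64


def encode(secret: str) -> str:
    reversed_text = secret[::-1]
    # Write each ASCII byte as eight '0'/'1' characters.
    bits = "".join(f"{byte:08b}" for byte in reversed_text.encode("ascii"))
    b64 = base64.b64encode(bits.encode("ascii")).decode("ascii")
    # Advance every third character by one code point.
    return "".join(chr(ord(ch) + 1) if i % 3 == 2 else ch
                   for i, ch in enumerate(b64))


def decode(obfuscated: str) -> str:
    # Undo each step in reverse order.
    b64 = "".join(chr(ord(ch) - 1) if i % 3 == 2 else ch
                  for i, ch in enumerate(obfuscated))
    bits = base64.b64decode(b64).decode("ascii")
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("ascii")[::-1]


hidden = encode("user@example.com")
print(decode(hidden))
```

Each step is deterministic and reversible for conventional code, but every one of them operates on individual characters, which is where a token-based model struggles.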

  • @ryanm2648
    @ryanm2648 1 year ago +2

    The issue is that you found these tests online. ChatGPT has scanned the internet so it can get many of the word riddles. Some of them it just doesn't know what you're asking.

    • @AgentFire0
      @AgentFire0 1 year ago +3

      Yup, I suspected as much when I copy-pasted Einstein's famous riddle into ChatGPT and it immediately blurted out the right answer. However, when I simply replaced "German guy" with "Russian guy" in the puzzle's description, ChatGPT fucking exploded with wrong answers, illusions of logical thinking, and other nonsense.
      So, in the end, it couldn't even match my input against the IDENTICAL input it was trained on, save for ONE replaced word.

    • @ryanm2648
      @ryanm2648 1 year ago

      @@AgentFire0 I have managed to actually get it to do problem solving by making word riddles that are not anywhere on the internet.
      I asked it something like this.
      There are three boxes, box A, B, and C, all placed side by side. These all look identical.
      Fred places a coin in box A for storage, and leaves the room.
      While Fred is not anywhere nearby, box A is switched places with box C.
      The coin is removed, and placed in box B.
      When Fred returns, he looks in the storage where he placed his coin, which box will Fred check first?
      It got this right for me.
      And then as another test, you could say the boxes are labelled BUT you need a way to ask it "Which position does he check" rather than "Which box does he check". Because, if they are labelled, he will see it has switched places, and check the third position (where box C was) BUT this is still technically box A. So even though the answer is the same if they're labelled (He will check box A) the place where box A is has changed.

  • @sharpieman2035
    @sharpieman2035 10 months ago +1

    Are you the Kevin Fang that works at Jane Street or is that a different Kevin Fang? He’s on LinkedIn if you’re not him and want to find him.

  • @I_Was_Named_This_Way...
    @I_Was_Named_This_Way... 1 year ago +7

    You are very underrated ):

  • @Ceidonianphysicist
    @Ceidonianphysicist 1 year ago +3

    I asked it to play tic tac toe with me. A game famous for always ending in a draw, been solved by computers since the 50s and famously used in the 80s film Wargames to teach the rogue ai in that film about no win scenarios. I won every game against chatgpt. It’s a very clear word prediction algorithm but intelligence it is not.
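
The claim that tic-tac-toe always ends in a draw under perfect play can be verified with a small exhaustive search. A minimal minimax sketch, assuming the board is stored as a 9-character string of "X", "O" and spaces (that representation and the names `winner` and `value` are illustrative choices):

```python
from functools import lru_cache

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals


def winner(board: str):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def value(board: str, player: str) -> int:
    """Outcome under perfect play with `player` to move: +1 X wins, 0 draw, -1 O wins."""
    w = winner(board)
    if w == "X":
        return 1
    if w == "O":
        return -1
    if " " not in board:
        return 0
    other = "O" if player == "X" else "X"
    outcomes = [value(board[:i] + player + board[i + 1:], other)
                for i, cell in enumerate(board) if cell == " "]
    # X picks the best outcome for X, O picks the best outcome for O.
    return max(outcomes) if player == "X" else min(outcomes)


print(value(" " * 9, "X"))  # 0: a draw when both sides play perfectly
```

An opponent that loses every game is therefore not searching the game tree at all, which fits the comment's point.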

  • @Ramonatho
    @Ramonatho 1 year ago

    Hitting the AI with "bruh" is what's gonna lead to the robot uprising isn't it

  • @NuncNuncNuncNunc
    @NuncNuncNuncNunc 1 year ago +6

    A one story house with a basement is still considered one story. Only above grade level counts. Edge cases are everywhere.
    Tug of war: You might be able to convince ChatGPT that the correct answer is incorrect.

    • @Alex_Vir
      @Alex_Vir 1 year ago

      Also is a roof balcony counted as an additional story?

    • @NuncNuncNuncNunc
      @NuncNuncNuncNunc 1 year ago

      @@Alex_Vir Balcony/deck is not a habitable space, not even inside the house, so no. Inside there could be stairs up to a roof deck in a one story house.

  • @ai-spacedestructor
    @ai-spacedestructor 1 year ago

    I'm not surprised by this. Puzzle solving was probably a fairly low percentage of the training data, and since it can't access the internet it's not able to learn or look up how it works, so it is just randomly guessing like most people probably would.

  • @JohnDlugosz
    @JohnDlugosz 1 year ago +20

    After seeing so many reports of astonishing things ChatGPT can do, it appears the worm has turned and now we find it interesting where it fails.

    • @fergalhennessy775
      @fergalhennessy775 1 year ago +15

      it's an nlp language model, not the oracle of delphi, i wouldn't be concerned if your job requires brain power.

    • @television9233
      @television9233 1 year ago

      Not really, seeing the difficulty of the puzzles I would be surprised if it got any of the reasoning correctly. (although I did suspect it would have seen at least some of those answers on the internet previously but I guess not)

  • @lightning_11
    @lightning_11 1 year ago +1

    7:08 all that math looks impossible to me.

  • @CC21200
    @CC21200 3 months ago

    "Difficulty" of a puzzle (to humans) is not a relevant metric here. Rather, it's more a matter of how well-documented the solutions are in its training dataset.

  • @SunnyNagam
    @SunnyNagam 1 year ago

    Language models are horrible at letter-wise questions like the first one. The model basically doesn't even read letter by letter or know where the letters are, since it turns the words into chunks and the chunks into embedding-space vectors. Language models are also not really built for math, since the neural networks they're based on have no way to perform calculations outside of memorization and pattern matching.
    That being said, even if these two limitations didn't exist it would probably still get the questions wrong, since AI just isn't there yet to do this level of multi-level creative reasoning... Yet.
    I'd be curious to see the results with GPT-4 and "chain of thought" prompting, as I'm sure that would perform much better.

  • @pal181
    @pal181 1 year ago +1

    I once tried something like this and it did same crap. Now I wonder how many hours they spent to get those ad results.

  • @Ikxi
    @Ikxi 1 year ago +1

    I gave up on chatgpt when it just kept giving me the same code over and over again.
    It's so painful.

  • @alepouna
    @alepouna 1 year ago +2

    Would be fun to see this revisited with chat gpt 4

  • @mikesum32
    @mikesum32 1 year ago

    Puzzle one sounds like the ABCs song.

  • @danraine9009
    @danraine9009 1 year ago

    bruh is my most used comment back to chat-gpt's answers hahaha had me laughing there

  • @TMinusRecords
    @TMinusRecords 1 year ago +1

    Funny how the language model is terrible at the language puzzle, but great at the maths one

  • @Paulo27
    @Paulo27 1 year ago

    I'm actually crying, this was hilarious.

  • @codewizard58
    @codewizard58 1 year ago

    chatgpt is a very chatty talking dice.

  • @djmips
    @djmips 1 year ago +1

    Any improvement with ChatGPT4?

    • @kevinfaang
      @kevinfaang 1 year ago +1

      Not sure (not going to sign up for premium). I think Bing AI uses GPT4 though...

    • @jean-lucsedits4319
      @jean-lucsedits4319 1 year ago

      I have been using Bing for a while now to search for things in a very obscure language. To be very honest, it's good at providing general answers but really bad at giving very precise results; in that case Google actually beats it. Also, I hate that it doesn't open YouTube videos on YouTube. In conclusion, I don't really see much of an improvement atm :)

  • @danieltao261
    @danieltao261 1 year ago

    Can you add what bgm you used to the description?

    • @kevinfaang
      @kevinfaang 1 year ago

      All original music in this one - the intro one is on this channel (the davie504 video)

  • @maxmustermann8447
    @maxmustermann8447 1 year ago

    Duuude, you new, you good! please keep it up! :D
    Sup from me

  • @samuelthecamel
    @samuelthecamel 1 year ago

    ChatGPT can't actually read letters. Instead, words are simplified into "tokens," which may be a full word or a part of a word. This puts it at a severe disadvantage with any word puzzles.
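
To illustrate the point about tokens, here is a toy sketch. The vocabulary and the greedy longest-match rule are invented for illustration; real tokenizers learn much larger vocabularies and split text differently. `TOY_VOCAB`, `toy_tokenize` and `to_ids` are hypothetical names:

```python
# Hypothetical toy vocabulary mapping word pieces to integer ids.
TOY_VOCAB = {"straw": 0, "berry": 1, "puzz": 2, "le": 3, "s": 4}


def toy_tokenize(text: str, vocab: dict[str, int]) -> list[str]:
    """Split `text` greedily into the longest vocabulary pieces available."""
    pieces = sorted(vocab, key=len, reverse=True)
    tokens = []
    i = 0
    while i < len(text):
        match = next((p for p in pieces if text.startswith(p, i)), None)
        if match is None:
            match = text[i]  # unknown character becomes its own piece
        tokens.append(match)
        i += len(match)
    return tokens


def to_ids(tokens: list[str], vocab: dict[str, int]) -> list[int]:
    """Map pieces to ids; -1 marks pieces missing from the toy vocabulary."""
    return [vocab.get(t, -1) for t in tokens]


tokens = toy_tokenize("strawberry", TOY_VOCAB)
print(tokens, to_ids(tokens, TOY_VOCAB))
```

In this sketch the model-side input for "strawberry" is just two ids, so a question about its individual letters has to be answered without ever seeing them.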

  • @herzogsbuick
    @herzogsbuick 1 year ago

    You have truly doubled Dolly with extra care.

  • @natew4724
    @natew4724 1 year ago

    Answer: No, but it sure thinks it can.

  • @babywaffles
    @babywaffles 1 year ago

    Try GPT-4

  • @adre2194
    @adre2194 6 months ago

    Language models are impressively bad with anything related to math. I once gave one a string and asked it to count the characters and it failed in the most spectacularly impressive ways.

  • @polygonalcube
    @polygonalcube 1 year ago

    I'd give it a score of 1/2.

  • @Veptis
    @Veptis 1 year ago

    The model sees token ids, not words or letters

  • @EvanBear
    @EvanBear 1 year ago

    ChatGPT doesn't actually understand or analyze anything, it just makes shit up. Its main goal is to "sound" true, whether or not it's actually true doesn't matter.

  • @1234567qwerification
    @1234567qwerification 1 year ago +1

    The Python code is cringe.

  • @Batman_akzo
    @Batman_akzo 1 year ago

    It can't answer simple questions and you're talking about jane street puzzles

  • @fosy6991
    @fosy6991 1 year ago

    i was here before he got big.

  • @atmavighyan6710
    @atmavighyan6710 1 year ago

    Worth trying again with v4

  • @davronsherbaev9133
    @davronsherbaev9133 1 year ago

    Important note: you started with GPT-4 and continued with ChatGPT. Next time try to use GPT-4, it's much smarter)

  • @monoco1159
    @monoco1159 1 year ago

    Prompt engineering is an actual skill, sir. You are not leveraging the complete potential of cGPT with your prompts. Retry this again but this time craft the prompts in instructions format. Look at its training for reference.

  • @aze4308
    @aze4308 1 year ago

    yoo

  • @SgtSupaman
    @SgtSupaman 1 year ago

    Bot solves 0/3 problems, scores 2/5... Sure, ok.