OpenAI o1's New "Paradigm" Test-Time Compute Explained

  • Published: 5 Nov 2024

Comments • 144

  • @bycloudAI
    @bycloudAI  22 дня назад +28

    Let me know if you guys want a dive into the methodologies of TTC, there's a lot of new papers/implementations coming out every day lol (entropix is a cool one)
    Check out NVIDIA's suite of Training and Certification here:
    [NVIDIA Certification] nvda.ws/3XxkFyj
    [AI Learning Essential] nvda.ws/4gvD474
    [Gen AI/LLM Learning Path] nvda.ws/4enwYE7
    You can use the code “BYCLOUD” at checkout for 10% off!

    • @Pastellsdj
      @Pastellsdj 22 дня назад

      Entropix video would be appreciated. Keep up the great work!

    • @broformation6530
      @broformation6530 22 дня назад

      Please cover entropix

    • @juliandarley
      @juliandarley 21 день назад

      thanks for the video. i would like to see more on methodologies of TTC.

    • @sirius-harry
      @sirius-harry 21 день назад +1

      Yes, please.

  • @lbgstzockt8493
    @lbgstzockt8493 22 дня назад +67

    OpenAI went from extremely secretive closed-source for profit to even more secretive closed-source for profit. Truly revolutionary change.

  • @rawallon
    @rawallon 22 дня назад +93

    Your channel is like twitter but only the good part, I love it

  • @Guedez1
    @Guedez1 22 дня назад +87

    One of the chains of thought felt like it was doing an A* search over all possible answers

    • @TheRyulord
      @TheRyulord 22 дня назад +6

      you might be thinking of "tree of thought"

    • @polyhistorphilomath
      @polyhistorphilomath 21 день назад +3

      @@TheRyulord In soviet Russia, tree of thought thinks of you. It's quite considerate.

    • @herp_derpingson
      @herp_derpingson 19 дней назад

      Replace the heuristic function with a value function from reinforcement learning and you get Q*
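
For anyone curious what the "A* over answers" idea in this thread could look like mechanically, here is a minimal, hypothetical sketch: best-first search over partial chains of thought, where `propose_steps` and `value` are stand-ins for model calls (a heuristic in the A* reading, a learned value function in the Q* reading). This illustrates the idea being discussed, not OpenAI's actual method.

```python
import heapq

def best_first_reasoning(question, propose_steps, value, max_expansions=50):
    """Best-first search over partial chains of thought.

    propose_steps(question, chain) -> list of candidate next steps (strings)
    value(question, chain)         -> estimated quality of the partial chain
    Both callables are hypothetical stand-ins for LLM / value-model calls.
    """
    # Max-heap via negated scores; each entry is (-score, chain-so-far).
    frontier = [(-value(question, []), [])]
    best_chain = []
    for _ in range(max_expansions):
        if not frontier:
            break
        _, chain = heapq.heappop(frontier)
        best_chain = chain
        if chain and chain[-1].startswith("ANSWER:"):
            return chain  # a finished chain reached the front of the queue
        for step in propose_steps(question, chain):
            new_chain = chain + [step]
            heapq.heappush(frontier, (-value(question, new_chain), new_chain))
    return best_chain
```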

  • @XetXetable
    @XetXetable 22 дня назад +46

    I don't understand why you're so insistent that using RL to learn reasoning can't cause new knowledge to be gained. You're implicitly assuming that if the model knows A, and that A implies B, then the model must already know B. But that's not true. The model knows the rules of chess, and these rules imply whatever the optimal strategy is, but it definitely doesn't know this optimal strategy. It may come to learn it (or approximations of it) through RL, though, as AlphaZero and similar did.

    • @tukib_
      @tukib_ 21 день назад +5

      Yeah, and deductive reasoning isn't the only form of reasoning. If anything, abductive and inductive reasoning are used a lot more than deduction in human cognition. So even without CoT, search methods are incredibly useful here and were key ideas of Cyc and Watson.

    • @WoolyCow
      @WoolyCow 21 день назад +5

      will be interesting to see how this plays out, I'm still split about this tbh...
      I think the difference between chess and reasoning is the 'why'. In chess there is no 'correct theory'; over time the model either gets better or it doesn't. It doesn't matter how the bot ranks the importance of pieces or aims to control the board, we just care about the result: winning. When we evaluate a model based on the outcome, it may very well 'reason', but it does so in such a weird and wonderful way that we just can't relate to it.
      But things break down when, instead of evaluating a model based on the outcome, we evaluate it based on the process. In this case the process is the steps of reasoning taken to get from A to B. The very reason NNs are so powerful, the fact that they 'think' in completely different ways to us, is exactly what makes it difficult for them to conform to a very specific set of human-prescribed ways of thinking. It forces a narrow range of 'correct' ways to think onto a bot that would prefer to find its own optimal way. It can't learn its own reasoning, because our evaluation will penalise it every time it tries to be creative.
      So this leaves two possibilities. It either:
      1. learns to conform to our definition of reason, or
      2. can't, and just does its own thing.
      I think the problem is as follows (take this with a grain of salt, I'm not an expert):
      when the models are trained, they mostly learn however they want; there isn't a prescribed way of thinking forced onto them. This results in them thinking in weird and wonderful ways that likely have little congruence with what we consider 'correct logic'.
      So, come time to fine-tune reasoning into them, or get them to start doing CoT, they may have learnt to imitate correct reasoning steps, but deep down they are still doing what they always did, in the weird and wonderful way they always did it.
      This training paradigm is unlikely to truly embed the 'correct reasoning process' into models, because by their nature they create their own way to reason. Either we need more synthetic data to encourage correct reasoning across all training data, or a new hybrid approach that blends the best of everything we've got and can reliably steer the model toward correct logical steps.

  • @Terenfear
    @Terenfear 22 дня назад +24

    Glad to see the original editing approach back.

    • @fnytnqsladcgqlefzcqxlzlcgj9220
      @fnytnqsladcgqlefzcqxlzlcgj9220 22 дня назад +3

      Yeah, this is like 2x slower so I can actually watch it; his videos were getting faster and faster to the point where it was just dopamine noise

  • @cdkw2
    @cdkw2 22 дня назад +38

    9:53 rare anger bycloud moment

    • @literailly
      @literailly 22 дня назад

      😂

    • @cdkw2
      @cdkw2 22 дня назад

      @@literailly its fun lmao

  • @BloomDevelop
    @BloomDevelop 22 дня назад +10

    Fun fact: I have spent 3-4 days trying to fix a single SQLite bug while I was debugging with AI

    • @arcturuslight_
      @arcturuslight_ 22 дня назад +4

      cute pfp. very pettable boyo

    • @3letterword
      @3letterword 22 дня назад

      I agree with arc

    • @kowaihana
      @kowaihana 22 дня назад

      that's why you must learn to read errors

    • @oguzhan.yilmaz
      @oguzhan.yilmaz 21 день назад +1

      ​@@kowaihana Or know how to program in sql

    • @kowaihana
      @kowaihana 21 день назад

      @@oguzhan.yilmaz I know some people who know how to code but don't know how to read basic errors

  • @AidanNaut0
    @AidanNaut0 22 дня назад +16

    so basically they found out that giving the layman a bit more time to solve an easier problem can be more cost-effective than giving the smart guy a menial task, and it is also worth giving the smart guy more time to train to more effectively solve harder problems...
    haven't we already known this for hundreds, if not thousands, of years?

    • @leoym1803
      @leoym1803 21 день назад

      You're right, we have. That's why you're out there training and becoming the best you could ever be, instead of writing things we already know for thousands of years, right?

    • @AidanNaut0
      @AidanNaut0 21 день назад

      @@leoym1803 just reiterating to learn. as the proomters say, "read it again"

  • @shApYT
    @shApYT 22 дня назад +15

    RLHF or in other words LGTM ship it to prod.

  • @GIRcode
    @GIRcode 22 дня назад +8

    kinda reminds me of how chess bots like stockfish are able to view multiple potential outcomes to find the best move possible

  • @4.0.4
    @4.0.4 22 дня назад +4

    I just hope this kick-starts inference backends like ollama, kobold, ooba, tabby or any other into having native support for test-time compute approaches. It would be nice to query some fast small model like a 12B Mistral and have it take longer but think through to a better answer.
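
One backend-agnostic way to spend extra inference time on a small local model today is self-consistency: sample several chains of thought at a higher temperature and majority-vote on the final answer. A minimal sketch, where `generate` is a hypothetical wrapper around whatever backend (ollama, kobold, etc.) you are running:

```python
from collections import Counter

def self_consistency(generate, prompt, n_samples=8, temperature=0.8):
    """Sample several reasoning traces and majority-vote on the final answer.

    generate(prompt, temperature) -> str is a hypothetical call into your
    local backend; the prompt asks the model to end with "Answer: <...>".
    """
    answers = []
    for _ in range(n_samples):
        completion = generate(
            prompt + "\nThink step by step, then finish with 'Answer: <result>'.",
            temperature=temperature,
        )
        # Crude answer extraction; real code would be more defensive.
        for line in reversed(completion.splitlines()):
            if line.lower().startswith("answer:"):
                answers.append(line.split(":", 1)[1].strip())
                break
    most_common = Counter(answers).most_common(1)
    return most_common[0][0] if most_common else None
```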

  • @John_YT
    @John_YT 22 дня назад +9

    "Bart say the line!"
    *Sigh* "The bitter lesson strikes again"

  • @H0mework
    @H0mework 22 дня назад +3

    Thanks! Very interesting about eng not improving.

  • @PieroSavastano
    @PieroSavastano 22 дня назад +2

    Totally agree, mid. Deep mind already did the most on this

  • @Originalimoc
    @Originalimoc 22 дня назад +3

    Okay this explains why higher temp and top_p give better results sometimes 😮

  • @acters124
    @acters124 21 день назад +2

    Also, what's interesting about silly things like counting the number of r's in "strawberry" is that it can easily be done if you instead start the AI with something more solid to work with, such as telling it to use its code interpreter/generation capabilities. That means 4o right now can technically count r's better than o1, because it can run simple Python code. This is the difference between running a nondeterministic model and asking it to leverage a tool that is specifically made to be completely deterministic. 4o being able to use code generation and an interpreter is more useful than what o1 can do with its limited capabilities. Instead, OpenAI will need to give o1 tools it can interact with that produce more solid, deterministic outcomes, so that when o1 does its chain of thought it can simply think: hey, I'm unsure, let me query a tool that outputs something reliable or check a verifiable database of information.
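
The tool-use point is easy to demonstrate: the letter-counting task that trips up the model on its own is a one-liner once it is delegated to a deterministic interpreter.

```python
from collections import Counter

word = "strawberry"
print(word.count("r"))   # 3, exact every time, no sampling involved
print(Counter(word))     # full letter histogram; 'r' appears 3 times
```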

  • @Words-.
    @Words-. 22 дня назад

    Really nice stuff, the most informative take I've seen so far on the o1 models, thank you!

  • @johnmcclane4430
    @johnmcclane4430 22 дня назад +8

    Do the studies that compare o1 vs GPT-4 use a chain-of-thought prompt for the latter? Because if not, the discrepancy in performance seems arbitrary.

    • @avraham4497
      @avraham4497 22 дня назад

      They didn’t and they shouldn’t have

    • @johnmcclane4430
      @johnmcclane4430 22 дня назад +1

      @@avraham4497 You'll have to explain why. Having COT baked in from training doesn't tell anyone if the model is strictly better at reasoning than another model given a COT prompt.

    • @avraham4497
      @avraham4497 22 дня назад

      @@johnmcclane4430 When you test the reasoning of humans with exams, do you try to prompt-engineer your questions to maximize the performance of the people being tested? Or do you simply write the question as clearly as possible? The answer is the latter, and the same should be true for testing AI systems.
      Your second sentence is true about the underlying LLMs, but not about the models as wholes; if you add CoT to a model, it becomes a different model, and it shouldn't be looked at as the same model. Are you telling me that an AI scientist from 10 years ago couldn't compare the reasoning abilities of GPT-4o and OpenAI o1 if they were given to him as black boxes without any explanation of how they work?

    • @johnmcclane4430
      @johnmcclane4430 21 день назад

      @@avraham4497 Humans are the ones with the reasoning skills trained in you dunderhead. As for your scientist question, I sure hope that anyone who does actual research quickly realises that they don't need to confine their AI to a singular line of questioning. Seriously, did you think any part of this through before you made your comment?

    • @avraham4497
      @avraham4497 21 день назад +1

      @@johnmcclane4430 Your response makes no sense to me

  • @szebike
    @szebike 10 дней назад

    From what I see, they do the same thing with answers in o1 that an LLM does with tokens: predict the most likely next step in the chain of thought as if the steps were tokens in a sentence. The hallucination problem is still prevalent, and we could eventually see new hallucination types (the whack-a-mole situation is still strong). It also makes the cost per answer increase a lot more, since you're essentially running multiple GPTs at once (hence the higher price). As you excellently showed, it has some crucial flaws, like small "copying errors" in those chains [strawberrrry].
    If we're somewhat sceptical, one could say it's futile because the Apple paper is correct and LLMs can't reason: breaking complex tasks into subtasks only raises the probability of a pattern match in that smaller context, and there is still no proper reasoning going on (based on logic rather than raw pretrained statistics; the performance drop even in o1 when small parameters in the benchmark questions are switched hints at that).
    My problem especially with OpenAI is the insane (ungrounded) valuation, which creates insane pressure to perform and thus destroys not only their working culture but also the honesty about what works and what doesn't. If you have the incentive to always announce "GPT-6 AGI next month confirmed" and stir up artificial hype to raise more cash than your burn rate, you will stifle any scientific progress. I think that's why Claude has made more progress: their team has more ease of mind while developing.
    In my opinion, Meta, OpenAI, Google, Anthropic etc. would fare much better if they worked on one big closed model intended for scientific progress rather than a product, while giving the community the chance to improve on it, since it's a global effort towards safe AI [yep, too late for that one]. As fun as these local models are, the only real use I see at the moment is manipulative AI slop everywhere (either to manipulate opinions or to make a quick and dirty buck from low effort). The only benefit I see is that it raised the AI research field into public awareness, but the overheated bubble behavior will do some harm; we'll see what's left once the dust settles.

  • @sgttomas
    @sgttomas 22 дня назад

    love these paper summaries. thank you 🙏 🎊

  • @TawnyE
    @TawnyE 11 дней назад

    squad mentioned!!!!!!!
    The game squad

  • @amantayal1897
    @amantayal1897 22 дня назад +1

    Now I think the performance increase of the o1 models is only because of new knowledge added during this CoT-based RL training. Also, the training data will be mostly comprised of maths and coding problems, since it's much easier to create CoT-based examples for them, which is reflected in the performance increase showing up only in these categories.

  • @acters124
    @acters124 21 день назад +1

    I most definitely got o1 talking to itself for MORE than 60 seconds, but it does seem to hit 59 seconds most of the time when given complex or longer tasks.

  • @shodanxx
    @shodanxx 22 дня назад

    I really resonate with you as a human during your mini meltdown at 9:00

  • @KeedsLinguistics
    @KeedsLinguistics 21 день назад

    Great video, keep up the good work.

  • @malchemis
    @malchemis 21 день назад +7

    The strawberry test is hard because the word gets encoded to something like [302, 1618, 19772] == [st, raw, berry]. The model doesn't reason in letters but in tokens, which removes some of the information necessary to, for example, count the number of letters (a quick tokenizer check is sketched after this thread).

    • @mirek190
      @mirek190 21 день назад

      People also do not think in letters.
      The latest research showed that a human neuron stores the whole word, like an LLM does.

    • @ZintomV1
      @ZintomV1 21 день назад +2

      @@mirek190 The difference is, humans can then spell out the word letter by letter which a model will not internally do, unless you use a CoT

    • @mirek190
      @mirek190 21 день назад

      @@ZintomV1 Using CoT only shows that the LLM can do it but is lazy ;).
      Also, most words an LLM can spell easily; "strawberry" is one of the exceptions.
      Most LLMs currently use stage-1 thinking (easy, without looping over the problem), rather than stage-2 thinking.
      Next-generation LLMs will probably learn this in the background, so even a word like "strawberry" will be easy once the LLM uses stage-2 thinking.
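
You can see the token split yourself with OpenAI's tiktoken library; the exact IDs and pieces depend on the tokenizer (the IDs quoted in the comment above are illustrative), but the point stands: the model sees a few sub-word tokens, not ten letters.

```python
# pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")   # GPT-4-era tokenizer; o1's exact vocab may differ
ids = enc.encode("strawberry")
print(ids)                                   # a handful of sub-word token IDs
print([enc.decode([i]) for i in ids])        # sub-word pieces such as ["str", "aw", "berry"]
```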

  • @DoktorUde
    @DoktorUde 22 дня назад +3

    The Google paper on test-time compute evaluated how a Process Reward Model or self-revision performed as they scaled. Given OpenAI's approach of training on millions of synthetic reasoning chains, you can't simply use this paper to claim it doesn't scale as OpenAI described, since it involves a very different approach to what the model does post-training. At least as far as I understand.
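
For context, the setup that paper evaluates is roughly best-of-N reranking with a Process Reward Model: sample many chains, score each step, and keep the chain whose worst step scores best. A hypothetical sketch (both `generate` and `score_steps` stand in for model calls; aggregation choices vary):

```python
def best_of_n_with_prm(generate, score_steps, prompt, n=16):
    """Best-of-N reranking with a Process Reward Model (PRM).

    generate(prompt)   -> one sampled chain of thought (list of step strings)
    score_steps(steps) -> per-step scores in [0, 1] from the PRM
    Both callables are hypothetical stand-ins for model calls.
    """
    best_chain, best_score = None, float("-inf")
    for _ in range(n):
        steps = generate(prompt)
        step_scores = score_steps(steps)
        # One common aggregation: a chain is only as good as its weakest step.
        chain_score = min(step_scores) if step_scores else float("-inf")
        if chain_score > best_score:
            best_chain, best_score = steps, chain_score
    return best_chain
```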

  • @ChristophBackhaus
    @ChristophBackhaus 22 дня назад +9

    Again... what is up with the bad prompting?
    o1-mini got 100% of the 20x20 table simply by using:
    # Goal
    Create a comprehensive multiplication table for numbers 1 through 20, including detailed calculations for each product.
    # Method
    1. For each pair of numbers from 1x1 to 20x20:
    a. Calculate the product
    b. For products involving numbers 11-20, show the step-by-step calculation
    2. Format the calculations as follows:
    - For simple products (1-10):
    5x7 = 35
    - For products involving 11-20:
    13x20 = 10x20 + 3x20
    10x20 = 200
    3x20 = 60
    200 + 60 = 260
    3. Create a grid displaying all products from 1x1 to 20x20
    # Additional Notes
    - Be thorough and show all steps for products involving numbers 11-20
    - Ensure accuracy in all calculations
    - Present the information in a clear, organized manner
    - The grid should have both horizontal and vertical headers from 1 to 20
    # Example Output Snippet
    1x1 = 1
    1x2 = 2
    ...
    13x20 = 10x20 + 3x20
    10x20 = 200
    3x20 = 60
    200 + 60 = 260
    ...
    20x20 = 400
    [Include the grid here]
    By following this method, you'll create a detailed and educational multiplication table that shows not just the results, but also the process of calculating more complex products.
    It took me not even 5 minutes to come up with this prompt, and no reprompting or anything was needed. This works zero-shot.

    • @float32
      @float32 22 дня назад +2

      I think 5 minutes of human time is more expensive than 5 minutes of compute.

    • @BHBalast
      @BHBalast 22 дня назад

      Maybe I don't get the assignment, but I tested it on Llama 8B with a generic assistant system prompt and the user prompt "# Goal
      Create a comprehensive multiplication table for numbers 1 through 20, including detailed calculations for each product." and it did just fine.

    • @crimsonkim6824
      @crimsonkim6824 22 дня назад +4

      The problem with further prompt engineering is that it is not generalizable: you don't want to be thinking up additional prompts for each and every new unique problem/task you want to solve.

    • @rolandgao5894
      @rolandgao5894 21 день назад +7

      They are talking about 20 digits multiplied by 20 digits. You are talking about 2 digits by 2 digits, which the graph in the video shows the model can do.

    • @janek4913
      @janek4913 21 день назад

      Dude. Not 20x20. 20 DIGITS. As in 95800904631026778660 x 25684705875830852248.
      All equations up to 20x20 are guaranteed to be in the training dataset somewhere anyway.
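
To make the distinction in these replies concrete: the benchmark is exact multiplication of two 20-digit numbers, which a deterministic tool handles trivially (the numbers below are taken from the comment above) but which a token-predicting model finds very hard.

```python
a = 95800904631026778660
b = 25684705875830852248
product = a * b              # Python integers are arbitrary precision, so this is exact
print(product)
print(len(str(product)))     # 40 digits
```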

  • @chapol8573
    @chapol8573 9 дней назад

    Where can I find these papers?!

  • @poipoi300
    @poipoi300 20 дней назад

    Thanks for this. TTC has the potential to be great, not like this though.
    Internally, the model needs to be able to execute loops to refine and transform information until it has determined it has solved the question. Using generated tokens to sort-of accomplish that is a lot of unnecessary work. It requires all thoughts to be translated from language to thoughts to language to thoughts over and over again. If we want reasoning, we will need models that can memorize DURING inference and modify that memory until a mechanism signals that the memory is in a suitable state to answer the question. Perhaps this functionality can be trained independently before being grafted on.

  • @gemstone7818
    @gemstone7818 21 день назад

    This was a very interesting analysis of the o1 model, on par with ai explained

  • @duduzilezulu5494
    @duduzilezulu5494 21 день назад +1

    10:00 went from ClosedAI to SigmaAI.

  • @randomlettersqzkebkw
    @randomlettersqzkebkw 22 дня назад +6

    wait... they got the weights?
    this was a joke right?

    • @bycloudAI
      @bycloudAI  22 дня назад +9

      ya they sent me on discord

    • @nehemiasvasquez8536
      @nehemiasvasquez8536 22 дня назад

      @@bycloudAI Seriously?... They shared the weights?...

    • @DefaultFlame
      @DefaultFlame 22 дня назад

      @@bycloudAI . . . are they sharing with anyone else? Like, somewhere anyone can access them?

    • @voxelsilk8462
      @voxelsilk8462 22 дня назад +7

      ​@@DefaultFlame Yes, the weights are available to Talk Tuah Plus subscribers and above.

    • @DefaultFlame
      @DefaultFlame 22 дня назад

      @@voxelsilk8462 The hell is "Talk Tuah"?

  • @tvwithtiffani
    @tvwithtiffani 21 день назад

    So, in my current system that uses an LLM: after watching this video, I added a setTimeout that flips a bool to true after 8 seconds, and a while loop that runs inference over and over for a "thought" about the current environment state while the bool is false. So it thinks for about 8 seconds and spits out about 4 "thoughts" in that time. After stuffing my speaker agent's context with those thoughts, it really does improve the quality of the final output. I'm just curious, did anyone catch how they calculate how long to "think" for?
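
The same time-boxed "thinking" loop is easy to reproduce in a few lines; here is a minimal Python sketch of the idea described above, where `generate` is a hypothetical call into whatever LLM backend the system uses:

```python
import time

def think_for(generate, environment_state, budget_seconds=8.0):
    """Keep sampling short 'thoughts' about the current state until the time
    budget runs out, then hand them to the speaker agent's context."""
    deadline = time.monotonic() + budget_seconds
    thoughts = []
    while time.monotonic() < deadline:
        prompt = (
            f"Current state:\n{environment_state}\n"
            "Previous thoughts:\n" + "\n".join(thoughts) +
            "\nWrite one short new thought that gets closer to a good reply."
        )
        thoughts.append(generate(prompt))   # hypothetical LLM call
    return thoughts
```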

  • @dakidokino
    @dakidokino 20 дней назад

    I've been waiting for OpenAI to have this ever since it was introduced to the public by babyAGI on Twitter. Twitter is always light-years ahead with the new beauties of AI. At first, when they said their business has proprietary methods, I thought it already had chain of thought. I guess I was wrong. GPT might go to the next level with this, since it had potential BEFORE this was in their pipeline! I used to prompt the AI to correct and review itself, since I naturally assume it could be making mistakes if it does too much coding or covers advanced topics.

  • @BattousaiHBr
    @BattousaiHBr 21 день назад

    A bit ironic that the choice for the "for profit" person is Elon Musk, considering that when OpenAI was founded he helped choose the non-profit format and later left due to disagreements about for-profit deals with Microsoft.

  • @hanif72muhammad
    @hanif72muhammad 20 дней назад

    Of course! Yapping straight away is never the best way to talk; you have to think first and choose your words carefully. Why did those people realize that so late?

  • @abod_bios2
    @abod_bios2 21 день назад

    0:32
    They named it Strawberry because of a glitch in the AI chat: if you ask it how many R's are in "strawberry", it will respond with 2.

  • @hanif72muhammad
    @hanif72muhammad 20 дней назад

    I just realized that some people who have high-level knowledge tend to be overconfident in their answers. It seems the same shows up in these large-parameter LLMs.

  • @scoffpickle9655
    @scoffpickle9655 22 дня назад

    Why not make an MBRL chatbot? (Do an MCTS over its tokens.) I know it's unrelated but still food 4 thought

  • @KevinKreger
    @KevinKreger 22 дня назад +3

    The reasoning trace is not aligned, so they don't want anyone to see it for that reason. I don't think there is a secret sauce. Anthropic beat them to the punch on the inference reasoning trace. This is catch-up to prior art.

    • @islandfireballkill
      @islandfireballkill 22 дня назад

      OpenAI explicitly states that they hide it for a competitive advantage. This was also what they said when they released their model card on GPT-4, which didn't include any training or architecture details, on why they aren't revealing their processes.

  • @michmach74
    @michmach74 22 дня назад

    Video on entropix when

  • @BloomDevelop
    @BloomDevelop 22 дня назад

    4:36 ah yes, how you are supposed to do that

  • @sFeral
    @sFeral 21 день назад

    Croatian politicians 0:07 memefied

  • @yash1152
    @yash1152 20 дней назад

    4:25 no way am I taking a certificate from a company whose chief said no coders are needed & ttteeet signde

  • @rign_
    @rign_ 22 дня назад

    I wouldn't trust ChatGPT or any other LLM to do creative writing; most of the "tone" ChatGPT writes with is the same across the genres I ask for.

    • @gerardo.arroyo.s
      @gerardo.arroyo.s 19 дней назад +1

      Tbh, LLMs are not capable of writing anything literary... yet. Soon they will, and humanity will be doomed

  • @MrFlexNC
    @MrFlexNC 21 день назад

    Lets see if you are right

  • @davidl.e5203
    @davidl.e5203 18 дней назад

    So it's giving the LLM anxiety and overthinking

  • @l.halawani
    @l.halawani 22 дня назад

    It's doing top-k to generate sentences and then top-p on sentences? xd

  • @seraphin01
    @seraphin01 22 дня назад +2

    Great coverage. To be honest, that Strawberry was just a big disappointment to me... after all the drama at OpenAI I really expected a game changer; we got the opposite.
    I'm sure all the others are already implementing a version of it as we speak, as it doesn't seem like much of a moat (or they wouldn't try so hard to hide the reasoning from us)..
    Also, LLMs will NEVER create a freaking cancer drug or something... god, I'm exhausted by those claims; LLMs are dumb AF when it comes to creating new things, by definition.
    So unless OpenAI is actually working on some form of AGI that is capable of learning and experimenting by itself, like AlphaFold, LLMs will just be a kind of useful assistant with major flaws.

  • @darthvader4899
    @darthvader4899 21 день назад

    ruclips.net/video/pi7LF-OpO6k/видео.html
    what's the music that starts here?

  • @bbok1616
    @bbok1616 22 дня назад

    Monte Carlos tree search

  • @Originalimoc
    @Originalimoc 22 дня назад +2

    14:10 😂 exp on x axis... That's log/diminishing on y axis 😂😂😂

  • @jamesgphillips91
    @jamesgphillips91 22 дня назад +1

    I am an unemployed dev who switched industries... I can build a CoT system with LangChain... I don't know why any serious software org would need this as a SaaS product when open source has had this for literally 4 years
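
To the commenter's point, the basic pattern is small enough that no framework is required at all. A bare-bones sketch (the `llm` callable is a hypothetical stand-in for any completion API, LangChain-wrapped or otherwise):

```python
COT_TEMPLATE = (
    "Question: {question}\n"
    "Think through the problem step by step, then finish with a line "
    "'Final answer: <answer>'.\n"
)

def chain_of_thought(llm, question):
    """One reasoning pass followed by naive answer extraction."""
    completion = llm(COT_TEMPLATE.format(question=question))
    for line in reversed(completion.splitlines()):
        if line.lower().startswith("final answer:"):
            return line.split(":", 1)[1].strip(), completion
    return None, completion  # no explicit answer found; return the raw trace
```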

  • @ten_cents
    @ten_cents 22 дня назад

    I really do wish OAI shills would defluff their hype a bit. It comes across as so disingenuous.

  • @FryGuy1013
    @FryGuy1013 22 дня назад

    Monte Carlos tree search lol

  • @Napert
    @Napert 21 день назад

    Why not hire experts and fact-checkers in respective fields to build a fully human-generated dataset and use that to train the model, while applying massive penalties for whatever the model gets wrong/makes up?

  • @KeedsLinguistics
    @KeedsLinguistics 21 день назад

    It’s funny how so many focus on grammar drills but forget that real fluency comes from actually using the language daily. Totally changed the game for me.

  • @PraveenKumar-bo7fw
    @PraveenKumar-bo7fw 21 день назад

    Monte carlos lmao

  • @XeTute
    @XeTute 22 дня назад

    First Comment =)

  • @SearchingForSounds
    @SearchingForSounds 22 дня назад

    Entropix for small models looks promising

  • @fergalhennessy775
    @fergalhennessy775 22 дня назад

    6666 view 6 hours ago...........

  • @panzerofthelake4460
    @panzerofthelake4460 22 дня назад

    this is indeed glorpshit

  • @arandomfox999
    @arandomfox999 21 день назад

    Lol, AI is now suffering from the Dunning-Kruger effect: as their model grows, their ego grows.
    This is now my head-canon interpretation of what's happening.

    • @gerardo.arroyo.s
      @gerardo.arroyo.s 19 дней назад +1

      AI is not sentient, so it doesn't have 'ego'... yet
      Now, the owners of the AI, that's a different story

    • @arandomfox999
      @arandomfox999 19 дней назад

      @@gerardo.arroyo.s My man, that was a joke. That's why it's my head canon, my own story I like enough to pretend it's what happened.
      Graces. People these days can't even distinguish humour from serious statements.

    • @gerardo.arroyo.s
      @gerardo.arroyo.s 15 дней назад +1

      @@arandomfox999 it was so unfunny though

  • @KuZiMeiChuan
    @KuZiMeiChuan 21 день назад

    The stress in "backfire" should be on the first syllable:
    it's BACKfire, not backFIRE 😊

  • @kacperogorek3958
    @kacperogorek3958 22 дня назад

    I feel like you're trying to prove a point that is off from the start. TTC was never intended to be a substitute for model scale - it was meant to unlock a new level of quality not feasible without it, on top of proper parameter scaling. Both DeepMind and OpenAI show similar results of generation scores being log-proportional to the TTC budget - at least in some section of the scale. So it's not that OpenAI made this graph up; different teams back it up. I find your statement here a bit biased and misleading :(

  • @jonclement
    @jonclement 22 дня назад +1

    Maybe their CoT method is that Q* leak? -- Great videos. My gf thinks i'm just watching that Daily Dose of Internet guy but for tech

  • @yolocrayolo1134
    @yolocrayolo1134 22 дня назад

    Seems fake and gaeh, but I felt like the way forward is to let the tokens yap for a whole minute before giving you an answer.
    I think we should have made specific modifiers or add-ons, like LoRAs but for ChatGPT, instead of forcing ChatGPT to be an all-in-one solution.

    • @tukib_
      @tukib_ 21 день назад +1

      One of the reasons GPT and other LLMs are so generalisable is precisely because of their large training corpora. Besides, OpenAI's API does allow you to run fine-tuning, presumably using some adapter because its inexpensive. And we know Azure OpenAI fine-tuning in particular uses LoRA.