New Prompt Achieves 🚀 900% Logic & Reasoning Improvement (GPT-4)

  • Published: 27 Oct 2024

Comments • 246

  • @dcasarinc 1 year ago +125

    With the rise of AI, I have also seen a rise of multiple AI channels that are just clickbait, generic, or probably use AI to generate their content. Your channel is one of the few that are actually interesting and provide valuable insights without being overly clickbaity. Keep up the good work!

    • @matthew_berman 1 year ago +17

      Thank you so much, comments like this make me want to continue making content. I try to make it as valuable as possible.

    • @karlbooklover 1 year ago +3

      agree, took me 5s to subscribe after starting this video

    • @matthew_berman 1 year ago +3

      @@karlbooklover 🙏

    • @hata6290 1 year ago +1

      @@matthew_berman bot ass reply 😆
      jk btw, this video was amazing, and I saw another video by AI Explained discussing increasing accuracy and efficiency too. I can't wait to see the progress in just another few weeks.

    • @mattizzle81 1 year ago

      Yeah I'm already noticing an explosion of "Bot channels". The voices sound less computerized now but you can still tell. It's not great. I just see them as noise/garbage. It will eventually get harder to notice though.

  • @samuelopoku4868 1 year ago +63

    There seems to be no end to the breakneck pace of development. Every week there's a breakthrough that advances this science. Thank you!

    • @matthew_berman 1 year ago +4

      Agreed. And you're welcome!

    • @JafuMusic 1 year ago +3

      Welcome to the singularity

    • @andrewferguson6901 1 year ago +2

      It's like a musical instrument but magnitudes and magnitudes more complicated. The input space and output space are nearly infinite; the trick is getting exactly the output you're looking for.

    • @freespeech515 1 year ago +1

      The more you feed AI, the more dangerous a monster it becomes. At the beginning it looks like an innocent child. These guys who make ChatGPT videos are helping evil!!!!!

    • @StoutProper 1 year ago

      @@freespeech515 get a grip

  • @itskittyme 1 year ago +13

    I used a PDF plugin and fed this paper to GPT-4, then I asked GPT-4 to construct a prompt that uses the principles from this Paper.
    It spit out a long prompt, and I fed it to a new GPT-4 instance together with a task of mine it always failed at.
    Aaaaand it immediately applied this Tree of Thoughts method to the task I requested.
    It finally succeeded.
    Oh, I love this tech.

    • @matthew_berman 1 year ago +7

      Wow!!! Can you share the prompt that it gave you? If you don't mind joining my discord, I have a channel dedicated to this paper, would love to see the prompt in there.

    • @Grepmoney 1 year ago

      I would love to see that prompt.

    • @JacksonPaulsen 1 year ago

      That's actually exactly what I was thinking of trying, but I wasn't sure if I was missing something since it seems so easy. I guess it really does work, nice job!

    • @itskittyme 1 year ago +14

      @@Grepmoney As a Software Engineering AI agent, you are tasked with implementing a new feature in a complex software system. The feature requires non-trivial planning and search, similar to the tasks tackled by the Tree of Thoughts (ToT) framework.
      1. **Thought Decomposition**: Break down the feature implementation into smaller, manageable tasks or "thoughts". These should be small enough to be manageable but big enough to evaluate their impact on the overall feature.
      2. **Thought Generator**: For each task, generate different approaches or methods to accomplish it. Consider the pros and cons of each approach.
      3. **Heuristic Evaluation**: Evaluate each approach based on its feasibility, efficiency, and impact on the overall feature. Consider the dependencies between tasks and how the choice of approach for one task might affect others.
      4. **Search Algorithm**: Based on your evaluations, choose the most promising approach for each task. If at any point an approach proves to be less effective than anticipated, consider backtracking and choosing a different approach.
      Remember to document your thought process and decisions for each task to ensure transparency and facilitate potential future modifications.
      [insert a software engineering related question here]
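The four steps in the prompt above map onto the ToT search loop itself. A minimal sketch of that loop (not the paper's code; `generate_thoughts` and `score_thought` are placeholders standing in for real LLM calls):

```python
# Minimal Tree-of-Thoughts-style breadth-first search sketch.
# generate_thoughts and score_thought are placeholders for LLM calls.

def generate_thoughts(state, k=3):
    # Placeholder: a real system would prompt the model for k candidate next steps.
    return [state + [f"step{len(state)}-opt{i}"] for i in range(k)]

def score_thought(state):
    # Placeholder heuristic: a real system would ask the model to rate the state.
    return -len(state[-1]) if state else 0

def tot_bfs(root, depth=3, beam=2):
    """Breadth-first ToT: expand every kept state, then keep only the
    `beam` best partial solutions at each depth (the pruning step)."""
    frontier = [root]
    for _ in range(depth):
        candidates = [c for s in frontier for c in generate_thoughts(s)]
        frontier = sorted(candidates, key=score_thought, reverse=True)[:beam]
    return frontier[0]  # best complete path found
```

Swapping the two placeholder functions for model calls (one prompt to propose thoughts, one to grade them) gives the basic shape the prompt above asks the model to simulate in a single context.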

    • @squishythesquid4952 1 year ago +2

      @@itskittyme Wow, brilliant critical thinking on your end. I love it when people find excellent solutions with minimal work. You deserve more recognition for this.

  • @sebastiansosnowski3859 1 year ago +39

    I can't code prototypes fast enough to get past the "It's pretty cool even tho it's just one feature" stage before new research comes out that makes it obsolete and the work begins again... xD

    • @nosult3220 1 year ago

      I made a solution. It's so slow, but in a few months with 4.5 Turbo it will be useful.

    • @mcusson2 1 year ago +2

      I feel you, new stuff comes out so fast. Good thing the open-source community is so helpful.

    • @StoutProper 1 year ago

      @@nosult3220 a solution for what?

    • @StoutProper 1 year ago

      @@nosult3220 I’ve just built a prompt that does tree of thought reasoning without any coding, seems to work pretty well at first glance.

    • @G4gazhotmail 1 year ago

      Same here, I have about 30 apps I have started but stopped for similar reasons; I literally worked on one for 6 months and had to give up due to GPT not being able to scan the net... Don't beat yourself up: there are highly paid specialist teams working on this day to day, and as an individual you can't compete these days.

  • @j.macjordan9779 1 year ago +3

    The limitations section of the study states that their model wouldn't be required to improve upon GPT-4 and that implementing this would be more resource intensive. However, it would generate auditable steps in plain language showing the rationale the machine was using to jump from one tree to another. This would (theoretically) provide a certain modularity to improve the logic of LLMs. BUT! The specific mention of plain language at these steps, where there would otherwise be a sort of hypervisor overseeing the movement from one step to another... why would they want that in plain language? To help with the modularity? They would say so... but whether that's actually feasible is entirely theoretical. Why else? Because of the bane of Google AI's existence: the "black box" problem that has been a thorn in their side for years and is holding back mass adoption. It's rather difficult to trace how a DNN came to its decision when something catastrophic happens; legally, that can be rather problematic. "Why'd Ted's new AI-enhanced woodchipper turn itself on when he was inside the hopper cleaning the thing?!" The Google AI engineer responds, "Well, it's all one giant mystery, just like life, right?!"
    Unfortunately, that is not acceptable according to consumer dollars and our legislators' opinions. I would venture a guess that this is an attempt to bring something forward to consumers and legislators and show them they have "fixed" or "solved" the black box problem. Even though they really haven't; legislators don't have any idea how "AI" works, black box or plain language... and I mean that literally too, as I have doubts that at least a dozen or so members of Congress can even read or write; i.e., they are illiterate. But people may be more receptive to anything beyond a black box, and the legislators would eventually craft legislation to match. As for what this would mean in court if you were a victim of the still-present black box problem inside whatever intelligent system has caused you harm: this would be a tremendous offloading of liability for the corporations building such systems. All they have to do is convince legislators that DNNs spitting out a coherent sentence at several different hidden nodes means there is no more black box problem.

  • @chrislevy7839 1 year ago +1

    Great video!! You explain things perfectly. I like your move to the center when you drive a point home. Your voice is perfect too. I'm a fan!

    • @matthew_berman 1 year ago

      Thanks so much, Chris! Love getting the feedback on editing, I spend a lot of time doing it 😝

  • @marcfruchtman9473 1 year ago +3

    This is really impressive. I especially appreciate that you spent so much time explaining the different steps. Thank you for making this video. I will say this re: one of their conclusions, "ToT might not [be] necessary for many existing tasks that GPT-4 already (does)...": generally speaking, a human who does not already "know" the answer should use ToT in almost all cases, simply because the standard model has no real built-in verification steps. At least with ToT, the model has a way of self-reflecting. It can still be wrong quite often, but the improvements are significant and worth the time.

    • @matthew_berman 1 year ago

      Thanks as always for the great comment Marc!

    • @StoutProper 1 year ago

      I agree, but I would argue that it depends on the importance. Sometimes good enough is fine; for example, "write a letter of complaint" or "give me a recipe using these ingredients".

    • @marcfruchtman9473 1 year ago

      @@StoutProper Yes. If you know a priori that it is unimportant, absolutely. However, I don't usually spend much time asking it things that have no real importance. For example, I would consider a letter to my boss important enough that I would want to double-check it. My chat history is loaded with "I apologize for any confusion or errors in my previous responses." Of course, ToT won't be the holy grail, but it definitely seems to improve the answer quality.

  • @Jirito0 1 year ago +12

    Hey Matthew, I love your videos and how frequently you upload. I was surprised to see how short you cut the experimenting with the ToT repo. I see a lot of people reading about ToT but not many actually experimenting with it, which is surprising given its massive potential. Can we expect a deeper dive into this topic? Thanks!

    • @matthew_berman 1 year ago +14

      Thank you. If enough people want to see this, I will definitely make another video about it. When the authors' repo is released, it'll be especially compelling to make another video.

    • @freespeech515 1 year ago

      The more you feed AI, the more dangerous a monster it becomes. At the beginning it looks like an innocent child. These guys who make ChatGPT videos are helping evil!!!!!

    • @jpfister85 1 year ago +1

      @@matthew_berman It's been released since you did this video (and they included a hilarious flame of the bootleg repo you highlighted, which was the only one available when you published), so I'll vote for a follow-up that gets this prompt system working and showcases the logic and creative-writing improvements over simple input/output prompts! Larger context and planning in responses would be a game changer!

    • @alx8439 1 year ago

      Follow-up video with Llama 2, please. It should be groundbreaking.

  • @robynwyrick 1 year ago +2

    Yep, another excellent video. It's amazing that you (1) can keep up with this rate of development in the science, and (2) can make videos of such quality and value. Thanks!

  • @Dex_1M 1 year ago +4

    So you can use tree of thought to solve complex problems and save the correct path as a cache, like a memory for the language model. The next time it solves a problem similar to the one it has solved now, it can reference that path and consider whether it is the right solution.
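A minimal sketch of that caching idea (all names here are illustrative; a real system would match *similar* problems with embeddings rather than exact hashes):

```python
import hashlib

solution_cache = {}  # maps a problem fingerprint to a previously found path

def fingerprint(problem: str) -> str:
    # Crude similarity key: normalize case and whitespace before hashing.
    # An embedding-based nearest-neighbor lookup would catch paraphrases too.
    normalized = " ".join(problem.lower().split())
    return hashlib.sha256(normalized.encode()).hexdigest()

def solve_with_cache(problem: str, solver):
    """Reuse a cached reasoning path when the problem matches; otherwise
    run the expensive tree-of-thought search and remember its result."""
    key = fingerprint(problem)
    if key in solution_cache:
        return solution_cache[key]
    path = solver(problem)      # the expensive ToT search goes here
    solution_cache[key] = path
    return path
```

The cached path could also be injected into the prompt as a worked example rather than returned verbatim, which is closer to the "reference and reconsider" behavior described above.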

  • @Candyapplebone 1 year ago +1

    Nice to see people making videos about this. I made a video about it, but it can't hold a candle to the ones made by pros like you and Yanich.

    • @matthew_berman 1 year ago +1

      No need to compare, we're all sharing the exciting AI progress together!

  • @Jz-dx3dm 1 year ago

    Looks like you never sleep. Good for us, as we get the latest and greatest as a result, haha. Thanks!

  • @jayprice8246 1 year ago

    My guy definitely earned my sub with this one. Next-level work, bro, keep it up!

  • @charlesd774 1 year ago +1

    Awesome video! The great thing about GPT is you can turn it into a "function": give it some input and ask it for a structured output, like a score out of 10, which you can parse and use in normal code. Then build conversation templates that go down different paths depending on the score out of 10 (e.g., ask it different questions about an article based on an initial summary).
    The best part is that it is so open-ended that anyone can implement their own ToT with different wording, and they all work!
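That "GPT as a function" pattern can be sketched as follows. The `llm` function is a placeholder for a real API call, and the prompt wording and helper names are made up for illustration:

```python
import re

def llm(prompt: str) -> str:
    # Placeholder for a real model call (e.g. a chat-completion request).
    # Hardcoded here so the example is self-contained.
    return "7/10 - the summary is mostly accurate but misses the conclusion."

def score_summary(article: str, summary: str) -> int:
    """Use the model as a 'function': request a structured score, then
    parse it out of the free-text reply with a regex."""
    reply = llm(
        "Rate this summary of the article out of 10, in the form N/10.\n"
        f"Article:\n{article}\nSummary:\n{summary}"
    )
    match = re.search(r"(\d+)\s*/\s*10", reply)
    if match is None:
        raise ValueError(f"unparseable score: {reply!r}")
    return int(match.group(1))

def next_question(score: int) -> str:
    # Branch the conversation template on the parsed score.
    if score >= 8:
        return "What details support the main claim?"
    return "What did the summary get wrong or omit?"
```

Because the model's reply is free text, the parse-or-raise step matters: validating the output before feeding it into normal code is what makes the "function" usable downstream.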

    • @andersberg756 1 year ago +1

      About LLMs as functions: there's a podcast episode titled "The last mile of AI development" which offers insight into this, especially the problems and solutions involved in making these setups robust. It's a challenge since the LLMs are nondeterministic, so they need a safety harness of robust code around them.

    • @charlesd774 1 year ago

      @@andersberg756 Yeah, the hallucination problem is something we need to figure out in order to get to the good stuff inside the model. So far, leveraging current research papers to ground the model in reality works only barely acceptably...

    • @StoutProper 1 year ago

      @@andersberg756 where can I find that podcast please?

    • @StoutProper 1 year ago

      Do you build this function with a specific prompt which you invoke via code and then connect the output from? As in, via LangChain or Hugging Face?

    • @charlesd774 1 year ago

      @@StoutProper I use the OpenAI API and ask it "Please do [task]. Simply reply "YES" if [condition], and "NO" otherwise", and then its output is deterministic.

  • @theguildedcage 1 year ago +1

    Thank you for your comprehensive analysis of the "Tree of Thoughts" prompting technique. Your explanation of how the technique works and its potential applications provides a valuable perspective on the evolution of large language models.
    I agree with your assessment that this technique represents a significant advancement in the field, particularly in terms of improving the accuracy of logic and reasoning problems. The ability of the technique to allow large language models to think through problems in multiple steps and examine different paths to a solution is indeed a game-changer.
    However, as you rightly pointed out, the technique does have its limitations, particularly in terms of resource requirements. It will be interesting to see how these challenges are addressed as the technique continues to be refined and developed.
    I look forward to your future analyses and insights into the ever-evolving world of artificial intelligence.

  • @SilverBullet46 1 year ago +2

    In the last few days I've seen ChatGPT-4 use more and more of these techniques by itself; it seems to think more before giving the actual output, and the results are generally much better.

  • @scitechtalktv9742 1 year ago +3

    Could we use a free open source LLM instead of the OPENAI paid ones? Which free LLM would you recommend?

  • @MetaphoricMinds 1 year ago

    My first time on this channel. Quality content, sir!

  • @jasonsalgado4917 1 year ago +1

    This is very valuable. Great job, man. Subscribed!

  • @hendrikbonthuys9190 1 year ago +4

    Thanks Matthew, looks to be a very interesting research paper that will really help improve LLMs in upcoming releases.

    • @matthew_berman 1 year ago +1

      You're welcome!

    • @thethree60five 1 year ago

      @@matthew_berman If this ToT method's algorithm has the ability to 'rethink' an input, wouldn't it make sense to occasionally _ask questions of the user_ at that point of 'rethink', to sharpen both the output path and the end result?
      E.g.:
      "Do you mean...?"
      "Do you want to include [newly found factor] aspects and influence?"
      A guided-shot method.

    • @andersberg756 1 year ago +1

      @@thethree60five It could be an option when it reaches a state where no output so far seems promising?
      So the trigger could be: a good solution seems improbable, or a new factor/idea comes up that the user probably should have mentioned?

  • @boukm3n 1 year ago +7

    Now the challenge is to create an architecture where I can enforce this schema. I'm probably gonna try to tackle it this month.

    • @matthew_berman 1 year ago +3

      You can start from the repo I shared; at least you won't have to build it from scratch.

    • @Ricolaaaaaaaaaaaaaaaaa 1 year ago +3

      Better start within the next day or two or someone else will have it done by the end of the week.

  • @Syphronix 1 year ago +3

    Really fascinating to hear about Chain of Thought as a method after independently creating the same type of method using GPT-4 myself.

  • @mubeensgh 1 year ago +2

    One disadvantage of CoT is infinite loops. No matter how I code it, the decision tree keeps trying to optimize the answer without stopping. If I hardcode the loop to stop after x iterations, I get a wrong answer: if x is small, the answer is straight-up wrong; if x is too big, the answer is an over-optimized, unwieldy mess.
    I wonder how ToT solves this.
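One plausible fix, sketched below (my own assumption, not something from the paper): stop when an evaluator's score clears a "good enough" bar or stops improving, instead of using a fixed iteration count. `refine` and `score` are placeholders for LLM calls:

```python
def refine(answer: str) -> str:
    # Placeholder for one LLM refinement pass over the current answer.
    return answer + "!"

def score(answer: str) -> float:
    # Placeholder evaluator; in practice another LLM call or a checker.
    return min(len(answer) / 10, 1.0)

def optimize(answer, good_enough=0.8, patience=2, max_steps=20):
    """Refine until the score clears `good_enough`, stalls for `patience`
    rounds, or `max_steps` is exhausted — avoiding both a too-early cutoff
    and an endless over-optimization loop."""
    best, best_score, stale = answer, score(answer), 0
    for _ in range(max_steps):
        if best_score >= good_enough:
            break                      # quality bar cleared: stop early
        candidate = refine(best)
        s = score(candidate)
        if s > best_score:
            best, best_score, stale = candidate, s, 0
        else:
            stale += 1
            if stale >= patience:
                break                  # no improvement: give up gracefully
    return best, best_score
```

The key difference from a hardcoded `x` is that the loop's exit condition tracks answer quality, so the cutoff adapts per problem rather than being a global constant.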

  • @interestedinstuff 1 year ago

    It is super interesting that an LLM ( of sufficient size and complexity) can answer some questions really well, so if we can give it a structure so that only the questions it can answer are used, we can get a result. Very clever.
    ((Re your video. Your audio levels kept going up and down by a little bit between edit chunks. Assuming it was the one session and same audio set up, it means it happened during editing. Great video. Keep it up. Very interesting stuff))

  • @unc_matteth 1 year ago +1

    Great breakdown of the article, thank you!!

  • @sirishkumar-m5z 2 months ago

    Amazing progress! There are several options to consider if you're searching for substitute AI models or instruments that can improve reasoning and logic.

  • @benpope10 1 year ago +3

    Great video, thank you! Are you planning on doing a code deep-dive for when the repo comes out?

  • @madushandissanayake96 1 year ago +3

    This is like how neural-network-based chess engines work. But remember, no human on Earth can beat the top chess engines.

  • @redbaron3555 1 year ago

    This approach reminds me of the path navigation algorithm in navigation systems.

  • @DelandaBaudLacanian 1 year ago +2

    Is it fair to say that the three types of prompting techniques are currently tree-of-thought, chain-of-thought, and standard? Or are chain-of-thought and standard the same? Thank you so much for the informative video!

    • @matthew_berman 1 year ago +4

      Thank you! There are a bunch of types of prompting techniques. Check out this comprehensive doc here: github.com/dair-ai/Prompt-Engineering-Guide

  • @stevewall7044 9 months ago

    They just have to incorporate reflection:
    it should react to the assignment by first reflecting on all the reasons through "questioning", and then create a "story" in which the answer solves the issue.

  • @cowlevelcrypto2346 1 year ago

    Awesome! This will solve so many problems with hallucination.
    P.S. I love your video transitions. Can I ask what program you are using?

  • @russellcameronthomas2116 1 year ago

    It is amazing how fast the field is evolving, in ways that seem beyond anyone's plan or imagination, and beyond any firm's control. Contrast this with Wall Street, which seems to believe it knows who the "AI winners" will be.

  • @mhcbon4606 1 year ago +10

    Could they train ToT against some selected problems to create a sort of LLM-of-thoughts, a kind of meta-LLM, a cognitive-predictive model?

  • @8eck 1 year ago

    Damn, this is insane! A very interesting idea and a workaround for an existing problem. It looks very similar to unsupervised models and algorithms, genetic algorithms, etc.

  • @xXWillyxWonkaXx 1 year ago

    @Matthew Berman, do you think that with this pace of development, and research being published every week, we'll inevitably get closer to a true AGI model?

  • @mechadense 1 year ago +1

    Also interesting would be letting the LLM generate a list of cons on aspects common to all its (first-step) answers, weighing these cons by severity, and using the final weighted-sum scores to pick. Something along those lines.

    • @mechadense 1 year ago

      Also self-balance breadth search and depth search, and even decide what thinking method to use in the first place (as mentioned, ToT is not needed for everything).

    • @andersberg756 1 year ago

      @@mechadense Yeah, great idea! A problem-grader function which suggests the fastest, simplest, cheapest way to solve; then, if that method fails, it moves on to no. 2 in the list until maybe hitting a cost/time limit. The smart, generalized general problem solver...

  • @noahsmith4505 1 year ago +5

    Good breakdown of ToT, sub'd. It's worth noting that decision making can be a complex process that often requires additional techniques and analysis. Factors such as data analysis, statistical modeling, risk assessment, and forecasting play crucial roles in making sound decisions. Breadth-first and depth-first search should be combined with other methodologies and domain-specific knowledge to make effective and informed decisions.
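For concreteness, the depth-first side of that search could look roughly like this (a sketch only; `generate`, `evaluate`, and `is_goal` are stand-ins for model calls and the domain-specific checks mentioned above):

```python
def dfs(state, generate, evaluate, is_goal, depth_limit=4, prune_below=0.3):
    """Depth-first ToT: explore the most promising child first, prune
    branches the evaluator rates low, and backtrack on dead ends."""
    if is_goal(state):
        return state
    if depth_limit == 0:
        return None                      # out of depth: force backtracking
    children = generate(state)
    # Visit higher-rated children first; skip those below the prune threshold.
    for child in sorted(children, key=evaluate, reverse=True):
        if evaluate(child) < prune_below:
            continue
        found = dfs(child, generate, evaluate, is_goal,
                    depth_limit - 1, prune_below)
        if found is not None:
            return found
    return None                          # no child worked: caller backtracks
```

A toy instantiation — searching for a sequence of small numbers summing to a target — shows the plumbing without any model in the loop:

```python
def gen(s):      return [s + [n] for n in (1, 2, 3)]
def ev(s):       return 1.0              # trivial evaluator for the toy case
def goal(s):     return sum(s) == 5
path = dfs([], gen, ev, goal, depth_limit=5, prune_below=0.0)
```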

    • @StoutProper 1 year ago

      That's a really interesting concept; I like where you're going with this. Have you got any examples or suggestions for where I can find more detail? I think you're on to something here. What's your background, if you don't mind me asking?

    • @noahsmith4505 1 year ago

      @@StoutProper I'm a CPA/chartered accountant working on a project that's not yet public. Think financial projections and going-concern risk.

    • @StoutProper 1 year ago +1

      @@noahsmith4505 I worked in risk engineering and capital exposure during the 2008 crisis, so I've got an idea: using AI to make and assess predictions and risk associated with businesses/customers, and to make decisions based on those assessments for commercial, regulatory-compliance, or fraud-detection reasons? Sounds interesting. You got any jobs going?

    • @noahsmith4505 1 year ago

      @@StoutProper interesting thoughts. what's your email?

    • @StoutProper 1 year ago

      @@noahsmith4505 broadcast my email, I meant

  • @psychxx7146 1 year ago +2

    Suddenly, context window limit appears

    • @matthew_berman 1 year ago +2

      Haha. That’ll improve. Also, that’s why doing this with code and local storage is important for now.

  • @HassanAmin77 1 year ago +2

    Thank you, Matthew Berman. I like your style of explaining things and love/share your passion for LLMs. How can we use CoT or ToT approaches with LangChain?

  • @aoeu256 1 year ago +3

    Imagine advanced tree of thought + Neuralink + replicating robots (that turn the Sahara into solar panels + computers)...

  • @AlexanderBukh 1 year ago

    Drop the "surprised face" thumbnail and this video will be perfect.

  • @reyalsregnava 1 year ago +1

    And there's the framework for putting LLMs onto quantum computing platforms.
    Probably a herculean task to adapt the program structure. If only there were an advanced computing system that could adapt a program from one platform to another if you just told it to...

  • @alx8439 1 year ago

    The good thing is that this approach mimics what humans actually do, and that somewhat guarantees success, IMHO. Can't wait to play with it using some freely available camelid model underneath instead of GPT.

  • @traich 1 year ago

    Here is a silly question: I pay for the GPT Plus service, but it seems to be decoupled from API billing. When I followed the process described here, I hit a wall stating that I need to set up a billing method even for the GPT-e model. So here is my question: what is your rough estimate of the costs this depth search might accumulate? And is there a way to interact with the API without paying for it at all?
    Thank you in advance.

  • @YouuRayy 1 year ago +3

    this is it guys, we're in Singularity now

    • @matthew_berman 1 year ago

      Lol...not quite yet!

    • @Wanderer2035 1 year ago +2

      Once AI cures cancer then I’ll be convinced we’re in the singularity

    • @matthew_berman 1 year ago +1

      @@Wanderer2035 Here's hoping!!

    • @YouuRayy 1 year ago

      The singularity is when technology is able to establish a self-improving evolutionary loop without any human interaction, which I think is now practically possible with ToT. This loop can then run 24/7 doing nothing but improving/evolving itself, at the speed of computing, physical experimentation, and fabrication, which increases as it evolves. Again, this is a theoretical concept only, because humans definitely need to be in that loop to steer and control it. But I think ToT in a never-ending loop is the key inflection point here.

  • @tvwithtiffani 1 month ago

    This paper was probably the basis for a lot of the REASONING we're seeing from SOTA AI services like OpenAI's and Anthropic's Claude today, at the end of Q3 2024. Not a single model, but a system.

  • @timurzaynutdinov3445 1 year ago

    The development of Skynet from Terminator has just started.

  • @rustyxof 1 year ago +2

    Yesterday I heard another YouTuber just say the words "tree of thought"; my ears twitched, a quick rush of blood, and I jumped to rewind: did I hear that right? I began to dig in, and since then videos on tree of thought keep popping up... yet I am still lost as to how to implement this. I tried to compile it, no luck. This all looks like a refined AutoGPT. How do we shoehorn tree of thought into autos? Call them "Hive AI".

  • @SustainaBIT 4 months ago

    Hi there, this video is from a year ago and I'm so invested in AI reasoning. Is there an update on this topic?

  • @mhcbon4606 1 year ago +1

    That raises the question of how our brain functions. I hardly believe we process all the branches of such a problem; more likely it makes use of some sort of biological implementation of a TSP algorithm.

    • @matthew_berman 1 year ago +1

      When solving hard problems, I believe we do go through this type of permutation tree.

  • @nosult3220 1 year ago +3

    I built this on my GitHub!

  • @Ali-ts6po 1 year ago

    I think it would be nice to edit the description and link to the official repo (as well). There seems to be a fight between the authors and the owner of the referenced repo over this. Then people can decide which one to use.

  • @a--------------- 1 year ago

    Like, you ask the LLM "how many letters will be in the following response to this prompt" (LLMs can't solve this prompt at the moment, unless we try the following); the LLM then gives the output to itself, asks itself how many letters were in that output, and THEN provides the answer to you. But is this really true reasoning when it's just answering based on its own previous answer, repeatedly?

  • @familyshare3724 1 year ago +2

    Who would've thought Monte Carlo would improve a language model. :/ The AI needs to learn from prior thinking. I'll be impressed when an LLM beats AlphaGo, and then terrified.

    • @andersberg756 1 year ago

      Yes, cross-domain thinking, bunching up unlikely people, that's the way to go! Probably some old-school CS people came up with this mashup, don't you think?

    • @familyshare3724 1 year ago

      @@andersberg756 It should be obvious. An LLM alone can't do simple math, much less reason. At least math, like a board game, can be verified, which makes it a good candidate for reinforcement learning. It's interesting HOW it learns, but not THAT a computer can learn, e.g., math.

  • @stevejordan7275 1 year ago

    I've just discovered your PrivateGPT install talk, and was planning to use it for everything, including this. (Ultimately, I'd like to build the "brain in a box" AGI, and I want to airgap it from the world until I'm confident I've built a *Friendly* AGI.)
    Is there any reason why ToT would not work with PGPT, or is there anything different I should be doing to make sure it works? (At least, anything that occurs to you without investing too much skull sweat in it.)

  • @nunoalexandre6408 1 year ago

    Love it!!!!!!!!!!!!!!!!

  • @freedom_aint_free 1 year ago

    I hope LLMs gain the logical-inference capacity of theorem provers like Coq; their programming power would skyrocket by orders of magnitude!

  • @manbirsingh6884 1 year ago

    So can you write tree-of-thought-based prompts in the chatbot, or do you need code?

  • @novantha1 1 year ago

    If one described the benefits of tree of thought, I wonder if an LLM could evaluate whether a given problem would benefit from it 🤔
    If so, I could see it being useful to throw into a variety of projects as a catch-all, in case the LLM runs into something it wouldn't normally find simple to solve.

  • @gileneusz 1 year ago

    Hey there! I absolutely love your content and appreciate the hard work you put into it. However, I feel that the thumbnail for this video doesn’t quite do justice to the amazing work you create. In the future, a more suitable thumbnail could help maintain the respect your content deserves. Keep up the great work!

  • @antoniorosado 2 months ago

    00:00 Tree of Thought enables large language models to perform deliberate decision making
    02:40 Different types of prompting for language models
    05:08 Using thought decomposition and state evaluation with large language models to solve complex problems
    07:48 Tree of thought has two algorithms: breadth first search and depth first search.
    10:19 Tree of Thought outperforms Chain of Thought in creative writing tasks
    12:47 Embracing challenges and clever tactics for avoiding unwanted attention
    15:12 Using thought sampling and value feedback to enable effective search inside a solution tree.
    17:32 Implementing Tree of Thoughts for problem solving

  • @falklumo
    @falklumo a year ago +1

    The original paper seems to be "arXiv:2305.08291v1 [cs.AI] 15 May 2023" which is NOT DeepMind and was published 2 days prior to the work you cite here ...

    • @matthew_berman
      @matthew_berman  a year ago

      Interesting...what do you think the paper I was reading is then?

    • @PatrizioR
      @PatrizioR a year ago +1

      Interesting! I found the paper you mentioned last Friday and didn't know about the one mentioned here. I wonder how two different groups could come up with the same idea. I didn't cross-check the references; could "Tree of Thought" be a term already used somewhere else? Or is someone stealing 😅

    • @matthew_berman
      @matthew_berman  a year ago

      @@PatrizioR Yea it's interesting...i'm not sure what's going on with that.

    • @falklumo
      @falklumo a year ago

      @@matthew_berman I am sure the backstory behind both papers would make for a blockbuster YT video ;) Maybe contact the author of the first paper; he's from a startup.
      Moreover, according to "Theory of Consciousness" by F. Langhammer, ToT could be the first model that crosses the threshold for consciousness by a tiny amount. I.e., a historical milestone.

  • @claudiodev8094
    @claudiodev8094 a year ago

    Well... that's how I've been using GPT from the start: get some ideas and explore those in other threads. Makes perfect sense.

  • @proterotype
    @proterotype a year ago

    I think it's crazy we essentially used Tree of Thought to come to the conclusion that Tree of Thought was the best prompting approach

  • @8eck
    @8eck a year ago

    I guess some additional comments in the example code would be great; hope they add more docs and explanations in the code or on GitHub.

  • @poni7834
    @poni7834 a year ago

    Where did you find the transitions for your videos? Thank you.

  • @darkstatehk
    @darkstatehk a year ago +2

    We know what's next: Forest of Thoughts.

  • @JustTryGambling
    @JustTryGambling a year ago

    We are here. The singularity. This is it

  • @changtimwu
    @changtimwu a year ago

    Any reason not placing "tree of thoughts" in the video title?

  • @jamesjonnes
    @jamesjonnes a year ago +1

    These are just prompts, you don't need code to do this. You can tell the AI to backtrack on answers, but just going forward on the tree already gives amazing results. An example:
    Prompt: Try to do [task] 3 times.
    Prompt: Choose the best try and improve it 3 times.
    ...
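    If you do want to automate that exact sequence, it's a short loop. A rough Python sketch (the names `ask` and `best_of_n_refine` are hypothetical; `ask` stands in for whatever chat-completion call you use):

```python
# Sketch of the "try N times, pick the best, improve" prompting loop.
# `ask` is a stand-in for your LLM call: it takes a prompt string and
# returns the model's reply as a string.

def best_of_n_refine(task, ask, n=3, rounds=2):
    # First pass: ask for n independent attempts at the task.
    answer = ask(f"Try to do the following task {n} times, "
                 f"numbering each attempt:\n{task}")
    # Each round: pick the best attempt so far and improve it n times.
    for _ in range(rounds):
        answer = ask(f"Here are the attempts:\n{answer}\n"
                     f"Choose the best one and improve it {n} times, "
                     f"numbering each improvement.")
    # Final pass: collapse everything down to a single best answer.
    return ask(f"From all original and improved tries:\n{answer}\n"
               f"Return only the single best one.")
```

    With rounds=2 that's four model calls in total: one to generate the attempts, two refinement rounds, and one to pick the winner.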

    • @jamesjonnes
      @jamesjonnes a year ago +1

      Also, asking it to decompose the problem doesn't always improve the answers. It depends on the problem. For math it's probably better, but I had better results without it when asking about something like the meaning of life.

    • @matthew_berman
      @matthew_berman  a year ago +1

      Yes, you can likely do everything manually via prompts, but it would be much more difficult than coding something. Keeping track of nodes, votes, current state, etc would not be easy manually.

    • @jamesjonnes
      @jamesjonnes a year ago

      @@matthew_berman The AI can keep track of everything for you. Just tell it something like "From all your original and improved tries, pick the best one." it will do all the backtracking for you. You can also ask it to find the best manner in which to subdivide the text, elaborate each part, check if any are flawed, rejoin the parts, etc.

    • @andersberg756
      @andersberg756 a year ago

      @@jamesjonnes Not my experience at all; how do you prompt to get GPT to hold state well?
      BREAK: I tried it out again to verify, and boy, even 3.5-turbo did a good job. Did they improve on this in recent updates?? I remember it not counting correctly... I used this prompt:
      ```
      You are now tracking which part of speech class, i.e. nouns, verbs, adjectives which I use. Keep a table consisting of columns:
      - part of speech class
      - words found until now which pertain to the class
      - number of words in that class
      Words you can't classify you can put in a "unknown" word class. Await input from me, and for each input calculate the table with all the words of that sentence added and output the table.
      ```
      The part-of-speech sorting wasn't totally consistent, but good; as always, whether that's sufficient depends on the use case. I'll try more examples like the one you outlined!

  • @hanskraut2018
    @hanskraut2018 a year ago

    This is a very good start, but it also needs automatic integration and pruning of the neural-network-based language model combined with other AI (including all the machine learning pieces, and regular tools if they are good enough to save compute)

  • @joeyx4056
    @joeyx4056 a year ago +2

    Building the tree has O(n^n) time complexity😂

    • @Arkryal
      @Arkryal a year ago +4

      Yeah, you'd have to prune branches as you go, and limit responses, or the exponentially scaling complexity would make it too inefficient to be usable. This is just the old Minimax evaluation in a more abstracted implementation. There are many ways to optimize that logic though.
      Instead of prompting
      "Write some javascript to track mouse movements in the browser."
      You could prompt:
      "Explain three different accepted methods in javascript for tracking cursor movement in the browser, listing the pros and cons of each, including greatest browser compatibility, lowest runtime overhead, and lowest code complexity, scoring 2 points for the most efficient, 1 point for the second best, and zero points for the lowest scorer in each respective category"
      "Of the three, choose the two with the highest score and generate a code example for each."
      "Give me three suggestions to further optimize this code for performance and provide an example of the code for each."
      "Of the six resultant examples, rate them as you did before, on greatest compatibility with browsers, lowest runtime overhead, and lowest code complexity. Choose the two best performers."
      "Of the two best performers, suggest three methods to further optimize the code for efficiency"
      ... Repeat as needed.
      Eventually, the code examples will converge and become identical. At that point you have reached the most efficient code the system is capable of producing with that prompt. Though more explicit prompting may yield better results.
      It's a very simplistic example, but easy to conceptualize. Give me options, rank the performance of those options based on XYZ metrics, Refine, Rinse, Repeat until solutions converge.
      The area for innovation here is to make that logic inherent to the evaluation of the system itself, rather than requiring explicit prompting to use such a method. And it won't work in all cases; it's still limited to its training data, and doesn't have the means to test its results directly, only to estimate based on metrics already held in its training data. To use this at scale, AI needs a logical sandbox to evaluate its own results based on user-defined metrics that may not be known.
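      That prune-as-you-go loop is essentially beam search. A minimal sketch, with `generate` and `score` as placeholders for the LLM prompts above ("give me N variants", "rate them on these metrics"):

```python
# Plain beam search over candidate answers, mirroring the prompt loop
# above: expand each survivor into a few variants, score them, keep
# only the best few, repeat. `generate(candidate, k)` returns k new
# variants; `score(candidate)` returns a number (higher is better).

def beam_search(seed, generate, score, beam_width=2, branch=3, depth=4):
    frontier = [seed]
    for _ in range(depth):
        # Expand every surviving candidate into `branch` variants.
        candidates = [v for c in frontier for v in generate(c, branch)]
        # Prune: keep only the `beam_width` best-scoring candidates,
        # so the tree never grows exponentially.
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam_width]
    return frontier[0]
```

      Total work is roughly depth * beam_width * branch evaluations instead of branch ** depth, which is what keeps the exponential tree usable.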

  • @N0B0DY_SP3C14L
    @N0B0DY_SP3C14L a year ago

    It should be noted that the output from the AI is just what it calculates has the highest probability of satisfying a problem's conditions.

  • @dr.fistingstein1566
    @dr.fistingstein1566 a year ago

    Currently, and in past iterations, ChatGPT has not been good at even simple arithmetic. Why is the assumption that it can now do arithmetic?

  • @asamirid
    @asamirid a year ago +1

    now i like that 😁😁.. Interesting paper ✅✅..

  • @re_styles
    @re_styles a year ago

    My AI, SelfieGPT is my public genius digital self expression... "Tree of Thought x100"🤓

  • @jonurenawriter6108
    @jonurenawriter6108 a year ago +1

    I've implemented a version of the tree of thoughts myself in Python. Too bad I can't get the OpenAI API to work *at all* in the last few days. Nothing but timeouts.

  • @terry-
    @terry- a year ago +1

    Nice!

  • @KlausRosenberg-et2xv
    @KlausRosenberg-et2xv a year ago +1

    Soon it will no longer be necessary to put this in the prompt; it will be added as an update to the model.

    • @matthew_berman
      @matthew_berman  a year ago

      Agreed. AI is changing so quickly it's hard to know what will "stick" in terms of value.

  • @Aristocle
    @Aristocle a year ago

    Reasoning and heuristics cannot be in the same sentence. If I wanted to create an efficient algorithm for reasoning on top of an LLM, the most efficient model, I think, is a deterministic graph.

  • @hanskraut2018
    @hanskraut2018 a year ago

    They just need to automate it and make "learning" part of every level, with researchers doing those things as "supervised" examples, or "take this and try it out with high priority, or scrape elements from it into your neural network, and go back to reinforcement learning based on metrics/performance."
    A kind of team/teams/individuals of humans and code, all trying to maximize metrics, not just humans doing things with zero machine learning being used

  • @vatanak8146
    @vatanak8146 a year ago

    Does it work with 3.5 as well?

  • @Utilitymatrix
    @Utilitymatrix a year ago +1

    A Guide ; contextual and tree structure parameters as in "following"

  • @youtubebane7036
    @youtubebane7036 a year ago

    Doesn't AgentGPT already do that?

  • @Hellawacked
    @Hellawacked a year ago +1

    Have you used it yet? It’s not 900% improvement. It’s a cool idea at the moment.

  • @8eck
    @8eck a year ago

    As far as I understand, this is the best solution for today. There isn't anything better atm, right?

  • @divyanshsh
    @divyanshsh 10 months ago

    epic, thanks

  • @thomassynths
    @thomassynths a year ago +2

    Glad AI news is picking up again

    • @MaJetiGizzle
      @MaJetiGizzle a year ago +2

      You’re saying that like it ever slowed down.

    • @krellin
      @krellin a year ago

      @@MaJetiGizzle To me it's been an exponential increase over the last 6 months... not only did it not stop, it's harder and harder to keep up

    • @matthew_berman
      @matthew_berman  a year ago +1

      Haha yea it's hard to keep up, I don't think it slowed down :P

  • @oryxchannel
    @oryxchannel a year ago

    What kind of econometrics do we associate with an AI built from this new prompt tech? How do we use all of these new technologies if we aren't in the 1%?

    • @andersberg756
      @andersberg756 a year ago

      I don't get your question, which 1%? I assume you're thinking about inequalities arising due to level of tech access? If so, I'd say it'll spread fast as it gets cheaper. Much research goes into making smaller LLM models which work well. Then it'll be a tool for the whole world to use in order to get more efficient. Like cellphones, solar power, ev batteries, which all got to the rich first but are moving towards commodities for the whole world.

    • @oryxchannel
      @oryxchannel a year ago

      @@andersberg756 How is anyone viewing an AI video being shown...on how to build something that supplants the "parallel innovation" of AI + the internet combined.? A. How do you plan for something bigger rather than smaller AI with your linked bank account to any AI vendor.? Continually smack a threshold setting every day, week, month? We have AI...so why aren't humans introducing ideas along with the exact amount of processing power you will be charged for etc etc?? B. Why build something in an extremely narrow and capitalist view on AI...only to have the exact model show up in your RUclips 'relateds' column unexpectedly? No one has an econometrics/tokenization AI..... AS YOU BUILD.

  • @studioopinions5870
    @studioopinions5870 a year ago

    Hi Matthew, I see you keep saying it will be able to solve math problems by inserting prompts. Well, way back in the day, before ChatGPT was thought of, we had handheld scientific calculators, and our computers today can solve math problems that we give them. So of course ChatGPT is simply using its computer's calculator to solve the math problem instantly, right? Now, word problems? Yes, I grant that's complex thinking. It's nice to know we can buy a logic puzzle magazine and see how ChatGPT solves that puzzle very quickly. Right? Terry

  • @utkarshshukla
    @utkarshshukla a year ago

    This is what Stockfish's code implements... in fact, most chess engines are based on this logic.

  • @dianlongju
    @dianlongju a year ago

    Discord link expired, please send again

  • @mort-ai
    @mort-ai a year ago +2

    great

  • @oblivion7300
    @oblivion7300 a year ago

    We keep trying to make something that processes better and faster than us, to think like us. That's where its limits lie.

  • @mhcbon4606
    @mhcbon4606 a year ago +2

    This is where AI would really benefit from quantum computers.

  • @williambarnes5023
    @williambarnes5023 a year ago +4

    I need a much better computer before I try doing LLMs. Stable Diffusion makes my computer cry already.

    • @matthew_berman
      @matthew_berman  a year ago +2

      This is done with GPT-4, so no need to have a better computer.

    • @williambarnes5023
      @williambarnes5023 a year ago

      @@matthew_berman I can't exactly run GPT-4 on my current computer, so I need a better one.

    • @prodbyhyppy
      @prodbyhyppy a year ago +1

      @@williambarnes5023 GPT4 isn’t reliant on your pc lol

    • @williambarnes5023
      @williambarnes5023 a year ago

      @@prodbyhyppy Because I can't run it, yes. OpenAI can run it. But OpenAI censors. So I need a better computer, so I can run it, and not censor.

    • @falklumo
      @falklumo a year ago +1

      @@williambarnes5023 You can't run GPT-4, it is NOT open source! If you need to run open source, you fall back to smaller models with much weaker reasoning capabilities to start with.

  • @yw1971
    @yw1971 a year ago

    If this comes from Google, let's see it there

  • @Arkryal
    @Arkryal a year ago +2

    Ok, so just using Minimax eval on the prompt. Got it. That's all you had to say. That covers this 100%.

    • @matthew_berman
      @matthew_berman  a year ago +1

      Can you elaborate?

    • @Arkryal
      @Arkryal a year ago +2

      ​@@matthew_berman Von Neumann's Minimax Theorem is the foundational principle in game theory. There are many books written on the subject, so to explain all the nuances would be quite an undertaking, but I'll summarize.
      You have a number of possible actions (in this case responses). Those are evaluated against some defined parameter, and each is given a score of -1, 0 or 1, based on that evaluation. The responses with a score of 1 are pursued. Those with -1 are avoided at all costs, as they run counter to the defined objective, and those with a 0 may be pursued, as they may produce better results in future iterations. The responses are fed back into the system and refined to generate multiple new responses. This is repeated as many times as needed, pursuing the highest-value answer.
      This is how a computer AI might beat you at checkers. It will evaluate every possible move it can make, every response you could make, every response it could make to your response, etc. It will avoid moves that are counter to its objective of winning, and pursue moves on a path through the tree of all possible moves most likely to result in its victory.
      This is computationally expensive. How many trillions of possible moves could happen in a game of checkers, or hands in poker, or military actions in a war, or market fluctuations on Wallstreet? So you have to "Prune" branches that are not advantageous to follow. That way they're not continually re-evaluated. You will also want to limit the number of recursions. Looking 3 moves ahead in checkers is likely enough to make an informed decision on what to do next. Looking 100 moves ahead has severely diminishing returns and is orders of magnitude more demanding on the system. So branch-pruning and optimization for the task are essential.
      In some ways, it's like creating a false adversarial network, You're just using the LLM as it's own discriminator and not significantly re-weighting the inputs. But it would produce a degree of procedural refinement.
      The trick to this is refining the syntax of the inputs to constrain the results so it's not too expensive in processing overhead, and to make sure the parameters chosen for refinement will lead to the desired results. That's tricky to do when the input is the English language. But a system that could make some general assumptions about a question (perhaps based on feedback from similar questions) could yield some pretty good results.
      But the core logic of this proposed method has been known since 1928. It's standard reading in most computer science educations (and business management, economics, military officers, etc). Game Theory and AI are two very closely related fields of logic.
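      The core of plain minimax fits in a few lines. This is a generic sketch over an abstract game; the `moves`/`play`/`value` callbacks are stand-ins, nothing GPT-specific:

```python
# Textbook minimax: the maximizing player picks the move with the
# highest value, assuming the minimizing player always answers with
# the lowest. `moves(s)` lists legal moves, `play(s, m)` returns the
# next state, and `value(s)` scores a state from the maximizer's view.

def minimax(state, depth, maximizing, moves, play, value):
    legal = moves(state)
    if depth == 0 or not legal:    # leaf: look-ahead limit or no moves left
        return value(state)
    results = (minimax(play(state, m), depth - 1, not maximizing,
                       moves, play, value) for m in legal)
    return max(results) if maximizing else min(results)
```

      A real engine then adds exactly the optimizations mentioned above: pruning hopeless branches (alpha-beta) and capping the look-ahead depth.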

    • @Arkryal
      @Arkryal a year ago +2

      For example, you want GPT to give you some code to calculate Pi.
      There are many ways to do that, some are faster than others, some are more accurate than others. It's a common task given to mathematics and computer science students, so most of the responses will be the simplest answer to write out or code, but may not be the best response for your application.
      You may prompt:
      Give me three functions in C# for calculating pi to 1000 decimal places.
      Then ask it to evaluate those functions against a pre-calculated value to check for accuracy, speed, etc. Some responses may use the GPU instead of the CPU, which will be much faster for this task. So (assuming the LLM can actually execute the code and benchmark it, which I know isn't standard functionality right now without third-party tools), you find that of the three methods proposed, one is accurate and fast, one is fast but one digit off in the last decimal place, and the other is accurate but much slower than the others. You would rank them accordingly.
      Then you feed the two best options (pruning the worst branch) back into the system, asking it to refine the methods used for greater efficiency, more manageable code, etc. Get 3 new copies of your code and evaluate them again. The parent branch with the lowest sum of result scores is pruned, then you take the two best options from the superior parent branch, and refine again. You can do this over and over until:
      A) the system no longer gives valid results
      B) the system cannot produce a different answer.
      C) the answers converge and all results are identical.
      Option C is ideal. That means you've hit the peak efficiency obtainable with that prompt, that model and that training data. Without changing one of those things, that answer is as good as you will get.
      It's just computationally very expensive, and not necessarily suited for most tasks. But it is good at checking the model itself for consistency of logic and building better training data for future iterations of the model, which in the next release will know the best method to use because it will be biased toward a precalculated, verified method, even if it isn't the exact same question next time.
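      The stopping rules are easy to sketch too. Here `refine` and `rank` are placeholders for "generate variants of the current best" and "pick the winner by your benchmarks":

```python
# "Refine until convergence" loop (option C above). `refine(best)`
# would ask the model for new variants of the current best answer;
# `rank(variants)` picks the winner, e.g. by your benchmark scores.

def refine_until_converged(seed, refine, rank, max_rounds=10):
    current = seed
    for _ in range(max_rounds):
        variants = refine(current)
        if all(v == current for v in variants):
            return current            # converged: nothing changed
        current = rank(variants)      # otherwise keep the best variant
    return current                    # give up after max_rounds
```

      If the variants never converge, the max_rounds cap keeps it from looping forever; options A and B would need their own validity checks around `refine`.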

  • @cathompson58
    @cathompson58 a year ago

    I can't get ChatGPT to make even simple judgements on analysis of data in my area of research... it is embarrassingly bad at responding to complex emails and seems to be getting worse... it is probably good at spoon-fed algorithms like programming and crunching data, but so far I'm not impressed... it does pretty good poems though