OUTLINE:
0:00 - Introduction
1:30 - Problem Statement
7:30 - Core Contribution: Synthetic Data Generation
9:30 - Sampling Premises
13:00 - Symbolic Deduction
17:00 - Traceback
19:00 - Auxiliary Construction
25:20 - Experimental Results
32:00 - Problem Representation
34:30 - Final Comments
Geometry facts:
The first triangle is isosceles, not equilateral.
The triangles A and B at around 14:00 are congruent (have the same angles).
Four points are cyclic (lie on a common circle) if opposite angles sum to 180°.
Good thing you kept attention in class coz that thing is still coming to replace you 💔
Adding here that opposite angles summing to 180° is only one of the criteria for cyclic quadrilaterals, and it's the one that applies to EADH (two right angles, at E and D).
The one that most easily applies to EBCD: a side of the quadrilateral is “under” the other two points with equal angles (in this case BC is “under” E and D with right angles).
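For readers who want to sanity-check the cyclic criterion numerically, here is a small Python sketch (the sample points are arbitrary): it places four points on a circle and confirms that opposite interior angles of the inscribed quadrilateral sum to 180°.

```python
import math

def angle_at(p, q, r):
    """Interior angle at vertex q between rays q->p and q->r, in degrees."""
    ax, ay = p[0] - q[0], p[1] - q[1]
    bx, by = r[0] - q[0], r[1] - q[1]
    dot = ax * bx + ay * by
    cross = ax * by - ay * bx
    return math.degrees(math.atan2(abs(cross), dot))

# Four points on the unit circle, in counterclockwise (cyclic) order.
pts = [(math.cos(t), math.sin(t)) for t in (0.3, 1.1, 2.9, 5.0)]
A, B, C, D = pts

# Opposite interior angles of the inscribed quadrilateral ABCD sum to 180°.
angle_A = angle_at(D, A, B)
angle_C = angle_at(B, C, D)
print(round(angle_A + angle_C, 6))  # 180.0
```

Moving any point off the circle breaks the equality, which is exactly why the criterion can be used as a test for concyclicity.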
@Buddharta Congruent doesn't only mean triangles with the same angles, though congruent triangles do have the same angles. If two triangles are congruent, you can place one triangle completely on the other (they are equal in every respect: sides and angles). If only the angles are the same, the triangles are called similar.
Two similar triangles might not be congruent.
@@RajarshiBose true, the triangles are similar and not congruent. I confused the definitions
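The congruent-vs-similar distinction discussed above can be captured in a few lines. A minimal sketch, assuming triangles are given as side-length triples (SSS comparison; the function name and tolerance are illustrative):

```python
def classify(t1, t2, tol=1e-9):
    """Compare two triangles given as side-length triples (SSS)."""
    a = sorted(t1)
    b = sorted(t2)
    if all(abs(x - y) < tol for x, y in zip(a, b)):
        return "congruent"  # same sides -> same triangle up to a rigid motion
    ratios = [x / y for x, y in zip(a, b)]
    if max(ratios) - min(ratios) < tol:
        return "similar"    # same shape, different size
    return "neither"

print(classify((3, 4, 5), (4, 5, 3)))   # congruent
print(classify((3, 4, 5), (6, 8, 10)))  # similar (scale 1:2), not congruent
print(classify((3, 4, 5), (3, 4, 6)))   # neither
```

Note that every congruent pair is also similar (ratio 1:1), but not vice versa.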
There might be an application in chip design actually...?
I expect this time next year (or sooner?), we'll be reading a similar paper where someone applied this kind of method to mathematics more broadly, leveraging something like the Lean language. It might be Coq or Isabelle instead, though, as those languages have more mature "autoprovers" of the kind used in this paper.
My money's on Lean, long term. It's growing fastest, has the best developer ecosystem, and has a cohesive single library of proofs that can act as training data. The things it's missing are a good autoprover and a good tokenizer (it's hard to tokenize because it uses tons of Unicode), but both are actively being developed.
That's on God
Thanks Yan, I love these paper review videos.
Very interesting paper. While quite limited, this approach leaves no space for hallucinations/inaccuracies, which is a great plus. I'm interested in how/where it can be applied further.
Not much further yet, because it's narrow: it can only find auxiliary points for 2D geometry problems. I'm sure DeepMind is working on AI for math in general, though it'll likely come in increments, step by step. But it'll definitely be huge if it can ever master all of math, not just 2D geometry.
I remember that back in school most of the people my age had difficulties with geometry specifically due to these extra constructions. Some had such difficulties with these that several jokes inevitably emerged. It is very interesting to see that, similar to school pupils, an ML model also struggles with these. But unlike school pupils with their geometry homework, an ML model living in the cloud has enough compute power to propose and reason about several constructions in the blink of an eye. Thank you for the video!
Great video very insightful! What programs do you use to record your videos? I love the paper backdrop with annotations with you at the bottom!
The search space for these geometry problems, if you go about it even marginally reasonably, seems very small compared with the scale of data processing available in modern times for, say, 1024x1024 diffusion. It seems very reasonable to 'solve' these tightly constrained problems.
If the large language model is somehow able to change the symbolic engine and the synthetic data, it can be adapted to any problem. For example, the language model could continuously switch at will between different symbolic engines that it has created, for better logical reasoning at inference.
The researchers said in the paper that readers should be advised that it may be difficult to adapt this solution to other domains. Geometry has really nice properties that allow them to do it this way. I'm sure we'll figure something out soon, though.
@@wege8409 I am talking about other domains of math. Basically, you can have a library of symbolic engines that are finely tuned, and you can toggle between them.
@@greengoblin9567 The paper also stated that it required a lot of compute power to get to that level. It would be interesting if the weights were released. I can find the GitHub page with the code, but no weights. Then one could try adapting it to other domains.
I believe geometry carries over into many fields of mathematics though @@wege8409
@@greengoblin9567 The problem is that other domains of math (e.g. calculus, linear algebra, differential topology) are very different from Euclidean geometry. The latter is probably a complete logical system, and is MUCH less expressive than e.g. calculus, in Gödelian terms. Other (and all the relevant) domains of math are probably logically incomplete by Gödel's theorem, so it's impossible to have a set of statements that you can sample from to get reasonable coverage of the domain. All the things that AlphaGeometry does with, say, 90% accuracy, a symbolic solver could have done efficiently with 100% accuracy (and that's why they could exploit it for training the LLM).
I'm thinking most problem domains, even code, can be treated as following long reasoning chains, but they would likely involve fuzzy review steps as well, like LLM code review, to help speed things up.
This video is criminally underappreciated. You have succinctly covered some of the most nuanced aspects of cutting edge AI research. Thank you.
Imagine a Go board of infinite size with randomly initialized "walls" defining a usable play area. This is AlphaGo for that.
Beautiful lesson
Get a grip on geometry sensei.
using agents and llms to drive tactical changes to achieve objectives seems to be everyone's trick of the week.
Can someone shed some light on why they use an LLM instead of some Q-learning approach with a fixed action space? I don't see why AlphaGeometry relies on LLMs to search for the best action...
This has probably already been stated, but having three sides of a triangle congruent does not just mean the triangles are similar (well, it does, with a 1:1 ratio): with all sides congruent, you know the triangles are congruent and therefore all corresponding parts are congruent. Thus the angles at the base of the isosceles triangle (not equilateral) are congruent.
... suggest you capture a new profile picture around 8:20.
I think that reverse engineering could benefit greatly from this approach (going from assembly to a language), especially if the target language is known (and imperative). Then a compiler becomes a solver, and statements/expressions become premises (constructs, if you will) in that sense. However, I may be wrong...
Thank you, Yan!
This video has saddle geometry: it starts off with an exciting premise, then rapidly becomes incredibly disappointing (a classical algorithm would be fine), then recovers at the end, if the compute budget really was modest.
Is the sole purpose of the LLM to suggest which construct to sample at each step? If so, why use an LLM at all? Is it because it is able to better approximate the optimal sampling strategy?
ARC-AGI sent me here
Isn't that an "isocolese" or something like that?
Yeah, an isosceles triangle. Yannic is not a high school math teacher hahaha
@@和平和平-c4i Yannic is not a native English speaker. I'm not one either, and I didn't know it was called that in English, but of course we both know what it is lol
It might actually be equilateral; I'm not a high school math teacher either, and I am a native English speaker lol
Can we apply the same concept to train it to write software, like Devin AI?
It can't properly think, but this thing may help with reasoning...
Stable Diffusion occasionally solves non-Euclidean geometry. It's horrifying.
Like, hyperbolic geometry, or spherical geometry, or...?
Riemannian geometry
@@leolrg2783 Hm? That's more of a general framework than a specific geometry, though? Like, Euclidean geometry, spherical geometry, and hyperbolic geometry are all described within it?
@amafuji Could you provide some links? Because your statement is rather bold.
People... That's a joke about how mangled the AI generations are 😂
Thz!
Isn't 25 75% of 30?...
No? 25/30 = 5/6 ≈ 0.8333...
The researchers seem to say that their model can solve 75% of the problems, and then they say they solved 25 out of the 30 problems in the competition.
However, they seem to mean that 75% of all the problems can be represented in the language they chose to use.
They then included only the problems that can be represented in that language.
And they solved 25 of the 30 problems they tested the model on...
0:17: 📐 Innovative paper introduces Alpha Geometry model for solving challenging geometry problems without human demonstrations.
5:18: 🔍 Solving complex geometry problem without human demonstrations requires specific language model and training.
10:27: ⚙ Automated geometry problem solving using algebraic reasoning and a small set of representations.
15:57: 🔍 Logical deduction and proof process in geometry problem solving.
20:59: ⚙ Symbolic deduction and traceback in problem solving process.
25:58: 📝 Advanced system solves math Olympiad problems with lengthy proofs, outperforming previous methods.
31:32: ⚙ Challenges in adapting math problems to a specialized language for proof steps and the correlation of problem difficulty with the number of required steps.
Recap by Tammy AI
I was gonna say, that reply was way too quick.
Ai
Is Tammy better than Harpa?
Take my words with a grain of salt; I deduced this with Claude's help. They are not solving the IMO per se. What they say with this paper: we can now reliably generate a shitton of synthetic data (at least for geometry). We use an extremely small model (around 1 GB in size) that is not very smart, to help with an essentially brute-force approach to solving geometry. Next, my predictions: if we apply this approach to a large model, and on top of it apply something like chain of thought (which they essentially do in this paper, though with a brute-force solver), it will perform significantly better on geometry problems than any human alive.
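The pipeline this comment describes (a small language model proposing constructions on top of a symbolic search) can be sketched in miniature. Everything below is a toy illustration, not the paper's actual engine: the rules, facts, and the stand-in "proposer" are all invented.

```python
# Toy sketch of an AlphaGeometry-style loop: a symbolic engine closes the
# fact set under fixed deduction rules; when it gets stuck, a "proposer"
# (the language model's role in the real system, here a canned stub)
# suggests an auxiliary construction and deduction restarts.

def deduce_closure(facts, rules):
    """Apply forward-chaining rules until no new facts appear."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if set(premises) <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

def solve(facts, rules, goal, propose_construction, max_aux=5):
    for _ in range(max_aux + 1):
        facts = deduce_closure(facts, rules)
        if goal in facts:
            return True
        aux = propose_construction(facts)  # the LM's job in the real system
        if aux is None:
            return False
        facts = facts | {aux}
    return False

# Tiny synthetic example: the goal is reachable only after adding fact "M".
rules = [(("A", "B"), "C"), (("C", "M"), "GOAL")]
suggestions = iter(["M"])
ok = solve({"A", "B"}, rules, "GOAL",
           lambda facts: next(suggestions, None))
print(ok)  # True
```

The key property mirrored here: the symbolic part alone can never reach the goal, and the proposer alone proves nothing; only the loop between them closes the gap.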
Every video makes my eyes water, I'm not used to light.
Yeah, I use grey background for pdfs. I don't know any trick for videos, though...
This is ridiculously impressive. The Olympiad is for top mathematicians, so an AI doing it is a HUGE breakthrough by itself.
Should look into sacred geometry. It sounds weird, but if you understand it (which is simple), then you can use its understanding, like the AI, to see things not in our current frame. Kind of like how the Wright brothers saw a plane in the pieces of reality, assisted by math and language and such to help find it.
I wonder why a group like this still cares about getting their papers published in journals. They were done with this project in April, and because of the paper they released the code in December. That's roughly 9 months that the community could have had access to the source code, but they waited on journal acceptance.
Wow they discovered how to do backtracking, but the hard way. LOL nubs. I love "research" that is just like nubbing for stock pumps
Isn't the key that they managed to reduce the search space on a problem with an infinite search domain down far enough that it can be backtracked? It's certainly not widely applicable yet, but it's potentially a very powerful premise
@@lilithstenhouse267 Yes, @telesniper2 is too dismissive for some reason. Brute force search would run into combinatorial explosion very quickly. LLMs are precisely for reducing the search space, that is, suggesting promising solutions.
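A minimal sketch of the point being made here: the same depth-first backtracking visits far fewer states when a policy orders the candidates first, which is roughly the role the LLM plays. The string-building toy and the prefix-based "policy" are invented purely for illustration.

```python
# Depth-first backtracking over strings built from 'a'/'b', with an
# optional policy that scores candidates so promising branches are
# tried first (a hand-written stand-in for a learned model).

def count_expansions(target, policy=None):
    count = 0

    def search(s):
        nonlocal count
        count += 1
        if s == target:
            return True
        if len(s) >= len(target):
            return False
        steps = [s + c for c in "ab"]
        if policy:
            steps.sort(key=policy, reverse=True)  # best-scored branch first
        return any(search(t) for t in steps)

    search("")
    return count

blind = count_expansions("abab")
guided = count_expansions("abab", policy=lambda s: "abab".startswith(s))
print(blind, guided)  # 13 5
```

Same search, same soundness guarantee; the policy only changes the visiting order, which is why a weak-but-informed model can still pay for itself against combinatorial explosion.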