AlphaGeometry: Solving olympiad geometry without human demonstrations (Paper Explained)

Yannic Kilcher

Просмотров 36 тыс.

Добавить в
- Мой плейлист
- Посмотреть позже
Поделиться

HTML-код

Размер видео:

Показать панель управления

Автовоспроизведение

Автоповтор

Опубликовано: 30 сен 2024

Комментарии • 72

@YannicKilcher 8 месяцев назад ⁺⁸
OUTLINE:
0:00 - Introduction
1:30 - Problem Statement
7:30 - Core Contribution: Synthetic Data Generation
9:30 - Sampling Premises
13:00 - Symbolic Deduction
17:00 - Traceback
19:00 - Auxiliary Construction
25:20 - Experimental Results
32:00 - Problem Representation
34:30 - Final Comments
@Buddharta 8 месяцев назад ⁺²⁴
Geometry facts:
The first trangle is isoceles not equilateral.
The triagles A and B at around 14:00 are congruent (have the same angles).
Four points are cyclic (are in a circle) if the sum of opposite angles are 180°.
@warrenarnold 8 месяцев назад ⁺³
Good thing you kept attention in class coz that thing is still coming to replace you 💔
@costadekiko 8 месяцев назад
Adding here that opposite angles summing to 180 is only one of the criteria for cyclic quadrilaterals, and is the one that applies for EADH (two right angles E and D).
The one that most easily applies to EBCD: a side of the quadrilateral is “under” the other two points with equal angles (in this case BC is “under” E and D with right angles).
@RajarshiBose 8 месяцев назад
@Buddharta congruent doesn't only mean triangles with same angles though they will have same angle. If two triangle are congruent, you can place one triangle completely on other one.(they are equal in terms of every thing, sides angles). If angles are same , triangles are called similar.
Two similar triangle might not be congruent.
@Buddharta 8 месяцев назад
@@RajarshiBose true, the triangles are similar and not congruent. I confused the definitions
@-mwolf 8 месяцев назад ⁺⁸
There might be an application in chip design actually...?
@piratepartyftw 8 месяцев назад ⁺²⁷
I expect this time next year (or sooner?), we'll be reading a similar paper where someone applied this kind of method to mathematics more broadly, leveraging something like the Lean language. It might be Coq or Isabelle instead, though, as those languages have more mature "autoprovers" of the kind used in this paper.
@piratepartyftw 8 месяцев назад ⁺⁶
My money's on Lean, long term. It's growing fastest and has the best developer ecosystem, and it's got a cohesive single library of proofs that can act as training data. The thing its missing is a good autoprover and a good tokenizer (it's hard to tokenize because it uses tons of unicode), but those are both actively being developed.
@HughBlackstone-tm6bw 7 месяцев назад
That's on God
@paxdriver 8 месяцев назад ⁺²⁷
Thanks Yan, I love these paper review videos.
@mahiainti678 8 месяцев назад ⁺⁹
Very interesting paper. While quite limited, this approach leaves no space for any hallucinations/inaccuracies which is a great plus. I'm interested how/where it can be further applied.
@asdfasdfasdf1218 8 месяцев назад ⁺²
Not much further yet because it's narrow, it can only find auxiliary points for 2D geometry problems. I'm sure deepmind is working on AI for math in general though, but it'll likely be in increments, step-by-step. But it'll definitely be huge if it can ever master all of math, not just 2D geometry.
@AndyHOTlife 8 месяцев назад ⁺⁹
I remember that back in school most of the people my age had difficulties with geometry specifically due to these extra constructions. Some had such difficulties with these that several jokes inevitably emerged. It is very interesting to see that, similar to school pupils, an ML model also struggles with these. But unlike school pupils with their geometry homework, an ML model living in the cloud has enough compute power to propose and reason about several constructions in the blink of an eye. Thank you for the video!
@DrDanielCho-Kee 6 месяцев назад ⁺²
Great video very insightful! What programs do you use to record your videos? I love the paper backdrop with annotations with you at the bottom!
@googleyoutubechannel8554 8 месяцев назад ⁺²
The search space for these geometry problems, if you go about it even marginally reasonably, seems very small compared with the scale of data processing available in modern times needed for say 1024x1024 diffusion, seems very reasonable to 'solve' these tightly constrained problems.
@greengoblin9567 8 месяцев назад ⁺⁸
If the large language model is somehow able to change the symbolic engine and synthetic data, it can be adapted to any problem. For example the language model can switch between different symbolic engines that it has created on will continuously for better logical reasoning at inference.
@wege8409 8 месяцев назад ⁺⁹
The researchers in the paper said that readers should be advised that it may be difficult to adapt this solution to other domains. Geometry has really nice properties that allows them to do it this way. I'm sure we'll figure out something soon though.
@greengoblin9567 8 месяцев назад ⁺³
@@wege8409 I am talking about other domains of math. Basically, you can have a library of symbolic engines that are finely tuned, and you can toggle between them.
@wurstelei1356 8 месяцев назад
@@greengoblin9567 The paper also stated, that it required a lot compute power to get to that level. Would be interesting if the weights of this are released. I can find the Github page with code but no weights. Then one could try to adopt other domains.
@Gingnose 8 месяцев назад
I believe geometry has carry over through many fields of mathematics tho@@wege8409
@andreaterlizzi 8 месяцев назад ⁺¹
@@greengoblin9567 The problem is that other domains of math (e.g. calculus, linear algebra, differential topology) are very different from Euclidean geometry. The latter is a probably a self-complete logical system, and is MUCH less expressive than e.g. calculus, in Gödelian terms. This is because other (and all the relevant) domains of math are probably logically incomplete by the Gödel theorem, so it's impossible to have a set of statements that you can sample from to get a reasonable coverage of the domain. All the things that AlphaGeometry does with, say, 90% accuracy, a symbolic solver could have done efficiently with 100% accuracy (and that's why they could have exploited it for training the LLM).
@LeePenkman 8 месяцев назад ⁺³
I'm thinking most problem domains, even code can be considered following long reasoning chains but would likely involve fuzzy review steps like LLM code review etc as well to help speed it up
@meinbherpieg4723 6 месяцев назад ⁺¹
This video is criminally underappreciated. You have succinctly covered some of the most nuanced aspects of cutting edge AI research. Thank you.
@oncedidactic 8 месяцев назад ⁺²
Imagine a Go board of infinite size with randomly initialized "walls" defining a usable play area. This is alphaGo for that.
@boffo25 8 месяцев назад ⁺³
Beautiful lesson
@ruchirgarg2725 8 месяцев назад ⁺¹
Get a grip on geometry sensei.
@robertfontaine3650 8 месяцев назад ⁺¹
using agents and llms to drive tactical changes to achieve objectives seems to be everyone's trick of the week.
@albinoameise 5 месяцев назад
Can someone shed some light on why to use a LLM instead of some Q-Learning approach with a fixed action space? I don't see the reason why AlphaGeometry relies on LLMs to search the best action...
@CarmelSchettino 2 месяца назад
This has probably been already stated, but having three sides of a triangle congruent does not mean they are similar (well it does with a 1:1 ratio), but with all sides congruent, you know the triangles are congruent and therefore, all corresponding parts are congruent - thus the angles are congruent at the base of the Isosceles triangle (not equilateral)
@theosalmon 8 месяцев назад ⁺⁵
... suggest you capture a new profile picture around 8:20.
@onurozkan1077 8 месяцев назад
I think that reverse engineering could largely benefit from this approach(going from assembly to a language) especially if the target language is known(and imperative). Then a compiler becomes a solver and statements/expressions become premises(constructs if you will) in that sense. However i may be wrong...
@ai-interview-questions 8 месяцев назад ⁺¹
Thank you, Yan!
@luke.perkin.inventor 8 месяцев назад
This video has saddle geometry: it starts off with an exciting premise then rapidly becomes incredibly disappointing, a classical algorithm would be fine, then recovers at the end if the compute budget really was modest.
@AntonMilan 8 месяцев назад
Is the sole purpose of the LLM to suggest which construct to sample at each step? If so, why use an LLM at all? Is it because it is able to better approximate the optimal sampling strategy?
@continuallearning8366 3 месяца назад
ARC-AGI sent me her
@paxdriver 8 месяцев назад ⁺²
Isn't that an "isocolese" or something like that?
@和平和平-c4i 8 месяцев назад ⁺¹
Yeah, an Isoscele triangle. Yannick is not a highschool math teacher hahaha
@ekstrapolatoraproksymujacy412 8 месяцев назад ⁺¹
@@和平和平-c4i Yannich is not a native english speaker, I'm not one either and I didn't know it was called that in english either, but of course we both know what it is lol
@paxdriver 7 месяцев назад
It might actually be equilateral, i'm not a highschool math teacher either and i am a native english speaker lol
@lerpmmo 6 месяцев назад
We can apply same concept to train to write software like devin ai?
@volkerengels5298 8 месяцев назад
Can't think that proper - but this thing may help reasoning...
@amafuji 8 месяцев назад ⁺⁵
Stable Diffusion occasionally solves non-euclidian geometry. It's horrifying
@drdca8263 8 месяцев назад
Like, hyperbolic geometry, or spherical geometry, or...?
@leolrg2783 8 месяцев назад ⁺¹
riemann geometry
@drdca8263 8 месяцев назад
@@leolrg2783 hm? That’s more of a general topic than a specific geometry though?
Like, Euclidean geometry, spherical geometry, and hyperbolic geometry, are all described within it?
@Hexanitrobenzene 8 месяцев назад ⁺²
@amafuji
Could you provide some links ? Because your statement is rather bold.
@timmygilbert4102 7 месяцев назад ⁺¹
People... That's a joke about how mangled dime generation are 😂
@kaikapioka9711 8 месяцев назад
Thz!
@nevokrien95 8 месяцев назад
Isnt 25 75% of 30...
@Hexanitrobenzene 8 месяцев назад
No ? 25/30 = 5/6 =~0.8333...
@ShA-ib1em 7 месяцев назад ⁺²
the researchers seem to say that their model can solve 75% of the problems and then they say they have solved 25 out of the 30 problems in the competition.
However they seem to mean that 75% of all the problems can be represented by using the language that they choose to use.
And then they say they have included only the problems that can be represented by using the language they choose to use.
And they solved 25 of the 30 problems they tested the model on ...
@atypocrat1779 8 месяцев назад ⁺⁷
0:17: 📐 Innovative paper introduces Alpha Geometry model for solving challenging geometry problems without human demonstrations.
5:18: 🔍 Solving complex geometry problem without human demonstrations requires specific language model and training.
10:27: ⚙ Automated geometry problem solving using algebraic reasoning and a small set of representations.
15:57: 🔍 Logical deduction and proof process in geometry problem solving.
20:59: ⚙ Symbolic deduction and traceback in problem solving process.
25:58: 📝 Advanced system solves math Olympiad problems with lengthy proofs, outperforming previous methods.
31:32: ⚙ Challenges in adapting math problems to a specialized language for proof steps and the correlation of problem difficulty with the number of required steps.
Recap by Tammy AI
@vzxvzvcxasd7109 8 месяцев назад ⁺²
I was gonna say, that reply was way too quick.
Ai
@notu483 8 месяцев назад
Is Tammy better than Harpa?
@nikitastaf1996 8 месяцев назад ⁺¹
Take my words with grain of salt. I have deduced it with Claude's help. They are not solving imo per se. What they say with this paper:we can now reliably generate shitton amount of synthetic data(at least for geometry). We use extremely small model (it has around 1gb in size) that is stupid to help us with essentially bruteforce approach to solve geometry. Next my predictions:If we apply this approach to large model and on top of it we apply something like chain of thought (which they essentially do in this paper but they use bruteforce solver) it will have significantly higher performance on geometry problems than any human alive.
@otrqffaimajg 8 месяцев назад ⁺³
Every video makes my eyes water, I'm not used to light.
@Hexanitrobenzene 8 месяцев назад
Yeah, I use grey background for pdfs. I don't know any trick for videos, though...
@Ori-lp2fm 8 месяцев назад
That child molster finally losing his hairc
@nevokrien95 8 месяцев назад
This is ridiculously impressive olympiad is top mathematicians so an ai doing it is a HUGE break through by itself
@Jacobk-g7r 8 месяцев назад
Should look into sacred geometry, it sounds weird but if you understand, which its simple, then you can use its understanding like the ai to see things not in our current. Kinda like how the wright brothers saw a plane in the pieces of reality assisted by math and language and such to help find it.
@moeinhasani8718 7 месяцев назад
I wonder why a group like this still cares about getting their papers published in journals. They were done with this project in April and because of the paper, they released the code in December. That's roughly 9 months that the community could have accessed the source code but they waited on a journal acceptance.
@The_Conspiracy_Analyst 8 месяцев назад ⁺¹
Wow they discovered how to do backtracking, but the hard way. LOL nubs. I love "research" that is just like nubbing for stock pumps
@lilithstenhouse267 8 месяцев назад
Isn't the key that they managed to reduce the search space on a problem with an infinite search domain down far enough that it can be backtracked? It's certainly not widely applicable yet, but it's potentially a very powerful premise
@Hexanitrobenzene 8 месяцев назад
@@lilithstenhouse267
Yes, @telesniper2 is too dismissive for some reason. Brute force search would run into combinatorial explosion very quickly. LLMs are precisely for reducing the search space, that is, suggesting promising solutions.

Следующие

Автовоспроизведение

V-JEPA: Revisiting Feature Prediction for Learning Visual Representations from Video (Explained)