I really admire the amount of work put into this video: the research, editing and everything else. Kudos.
Thanks, Rafael! Yeah, it’s a lot of work, but well worth it if the videos help more people.
A.G.I. will be man's last invention.
I always think Kudos sounds like what they call Cheetos in Greece for some reason.
Without having considered the counts themselves, I would have expected it to be a larger channel given the way the video is structured and how precise some edits are. But that aside, it's interesting how we went from finding neural patterns to solve tasks to finding neural patterns that optimize finding neural patterns to solve tasks. Essentially analogous to a factory creating better factories. I'm aware that in public topics such as AI are usually only mentioned in a shallow manner, referencing typical generic developments of poor face or speech neural networks, even providing merely mediocre samples of them additionally, whereas there are so many more intricate creations from the past 10 yrs. While the media is often seen as immediate, it depicts new types of technology as concepts of the future or reduces their depth, even though they already are defining the present. Maybe too metaphorical, I just think very few people are aware of the state of models and have an insight on what is to come soon.
Thanks for the comment! Really appreciate it! Yeah, we are making progress at a breakneck speed. I can’t even imagine where we will be by the end of 2023.
It used to be a manual process. When transistor count increased, one of the first things computer engineers did was put 10x more logic gates into the multiplication part of the ALU, using many predicates, like filling in a truth table.
Having a known software method to optimize them further is nice, but keep in mind that hardware made for the task is already super optimized, with memory bandwidth and latency being the limit.
Even with 3d stacked cache and large matrices?
I thought the point of this was that it could optimize on any hardware. Those optimizations can then be built into the next iteration of the hardware. I don't see this as a purely software optimization. Wait till the AI is capable of implementing those changes in hardware by itself...
When using truth tables as part of the digital design process, the number of logic gates gets ridiculous as we add bits to the data buses we operate on. As far as I know, we do increase the speed of the custom circuitry, but differently: by adding more gates to the designs. Still, I am not sure about "perfect"; being an engineer means realizing there is never a ceiling you can reach, I think.
It is not all about performance. If you optimize in the wrong place, you get vulnerabilities that could ruin your whole business.
The real breakthrough here is AI being able to improve the design of a crucial building block of its own implementation.
Always high quality, high entropy content. DeepMind's application of Reinforcement Learning (RL) to solve problems is fascinating, and I agree that it will be interesting to see how they use RL moving forward: what other problems will be "gamified" in the future ...
Right?! I'm tempted to say "the next decade will be interesting" but who am I kidding: 2023 will be really interesting!
Thanks for the comment! Appreciate your words!
@@underfitted maybe an AI learns how to create AIs given a set of task-networks
A.I. could (and likely already does) gamify the "human farming game," which neither Marx nor Rothschild, nor any other man, has totally figured out to perfection.
The last 75+ years of Technocrats and Cyberneticians have indeed made tremendous documented progress in innovating improvements to the human farm; yet, just imagine what machine learning from and for us (and our digital twins) could bring forth from the human stock.
Then maybe, by the time we get elderly, the brain-to-silicon tech will be ready for perfect virtual eternity with our tremendous A.I. pals. No belief in God needed or expected.
Your compliment was sufficient. Let's see Paul Allen's comment.
I want DeepMind to build an AI that takes as input a training set and labelled dataset, and as output gives you the best hyperparams to learn that data with the lowest loss and best generalisation error.
Jesus, AI advancements in the last year have been insane. This is phenomenal.
Yeah. When I look back at the list of things that happened this year alone, it’s crazy!
This is the video I’ve been looking for describing AlphaTensor. Concise and to the point about why this is important. Subbed.
Glad it was helpful! Thanks
Can AlphaTensor solve x265 decode on Playstation 3? There are no good compilers, even IBM had separate compilers for the PPE and the SPEs in their own CellBE CPU and nVidia kept the architecture details of the G70 confidential. Better yet, can AlphaTensor produce an LLVM backend for Playstation 3? Or optimise LLVM backends more generally?
Great questions :)
For non-square matrix multiplications:
multiplying an (a×b) matrix by a (b×c) matrix the naive way takes a*b*c scalar multiplications.
Right
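For anyone curious where that a*b*c count comes from, here is a minimal sketch of the schoolbook algorithm with an explicit counter (the function name and the example shapes are just for illustration, not from the paper):

def naive_matmul(A, B):
    """Multiply an (a x b) matrix A by a (b x c) matrix B the schoolbook way,
    counting scalar multiplications as we go."""
    a, b = len(A), len(A[0])
    b2, c = len(B), len(B[0])
    assert b == b2, "inner dimensions must match"
    C = [[0] * c for _ in range(a)]
    mults = 0
    for i in range(a):
        for j in range(c):
            for k in range(b):
                C[i][j] += A[i][k] * B[k][j]  # one scalar multiplication
                mults += 1
    return C, mults  # mults == a * b * c

# Example: a 2x3 matrix times a 3x4 matrix uses 2*3*4 = 24 scalar multiplications.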
It's one of those things that is great in theory but only mildly useful in practice. Something like a 40% speedup was achieved by the paper authors.
I loved your hardware interpretation of the results, because today IBM and basically everyone is trying to make better chips for matrix multiplication tasks, from ASICs to accelerators, and a new algorithm eliminates a huge compute cost 💯💯. Thanks a ton for the video; I too am working with matrices at this moment.
Glad you liked it!
I am stunned, now subbed, and shocked by how juicy, fun to watch, && also insightful you made this video; you brought us through the whole journey in minutes and explained every obstacle.
On top of that, your visuals and audio are fantastic; you produce extremely high-quality content.
Please more :)
Glad you enjoyed it! Thanks for taking the time and writing such a thoughtful comment. More to come!
@@underfitted Of course, this is the least I can give back! Hope to see more awesome ML content and also the awesome color grading ;)
So can we expect a generalized version of optimal matrix multiplication?
Yes. The generalized version will eventually be: make the AI optimize it on the fly
Thank you. I heard about the article when it came out, but didn't realise the implications at a practical level. The machine improving its own hardware and software to get better. That's the path to singularity
Did it take into account finite precision and truncation issues?
Now, how about matrix inversion problem?
Your channel and your tweets are really helpful in my ML learning process.
The quality of your videos is really high and the content is incredibly useful for better understanding ML.
Vamos Santiago!
Marlon, thanks so much for your comments! Really appreciate them!
Totally agreed! And I always wonder why the traffic on YT seems so much lower than on Twitter. These videos deserve more views!
Isn't the stability of the algorithm neglected in the big-O analysis?
I think I understand the implication? We're talking about an algorithm's ability to adapt, right?
We can see in some neural networks that novel behavior appears after millions upon millions of repetitions, which I guess is similar to adaptation. But an AI's ability to use different techniques on different problems as a fundamental feature, versus as a result of extensive training, might be a big deal.
I really enjoyed the video. You kept me interested and whisked me along your train of thought with very little effort on my part. Perfect story telling, great explanation of all the necessary topics and you made it engaging. Awesome work.
I wonder if these multiplication optimizations are usable in GPGPU. Since GPGPU can calculate each element independently in parallel, keeping the operations independent is important. I don't know as much about these multiplication algorithms, but maybe they can be parallelized in their own way.
We already use GPUs (or TPUs) for all this. AlphaTensor is computed with billions of f16 mat4 calcs on your GPU, and its results could be used to optimize the next iteration of hardware pipelines, of course... (or perhaps, less usefully, emulated on the same device). Why couldn't we bootstrap math and theory? It's how these devices make improvements in the first place. GPGPU, yeah. I don't really get what you're trying to say, tbh.
Dude! Fantastic video. Full marks on making an exciting, clear, attention-grabbing and informative video! Honestly, the production was faultless.
Thanks, Jason! Much appreciated! Still learning a lot, but I'm glad the videos are coming out better.
Provocative presentation - instant subscriber.
Things you said that caught my attention:
1. Concisely stated the problem, i.e. computation time for matrix multiplication
2. Nice historical summary of approaches to solve the stated problem
3. Clear understanding of "small gains" may yield large rewards. Examples of applying this "algorithm optimization method": solving complex problems found in general relativity and fluid mechanics (non-linear differential tensor and vector systems) or standard quantum mechanics (or string/membrane theory)
4. Your enthusiastic imagination; unafraid to jump ahead to what might be next
Looking forward to what else grabs your attention
Thanks! Appreciate such a thoughtful comment!
In quantum computing, when we introduce noise to the system, the calculation of the density matrix for the state of the quantum system grows exponentially. Maybe this will help a lot in that field.
And, can you share the algorithm? Or should I keep multiplying matrices the naive way?
The paper shows the results for 4x4 by 4x4 and 4x5 by 5x5
Did the paper specify which method was found by Alphatensor for each matrix dimension? Did they make that information public or just the minimum number of multiplications found?
Their paper includes the specific steps found by AlphaTensor to multiply 2 4x4 matrices and 4x5 by 5x5. www.nature.com/articles/s41586-022-05172-4
0:27 : you cannot "solve this equation" because it IS NOT AN EQUATION! lol 😂🤣😂🤣
You are right. Thanks for taking the time!
What a great video. You really know how to capture your viewers' attention.
I appreciate that, Pedro! Thank you!
Great video, thanks for making it! I have a small nitpick, which you do consistently: x^2 - y^2 is not an equation, where's the equals sign? You can call it a polynomial, or say you want to compute that expression, but there is no equation there to solve.
💯 I should have said expression.
Where is the link to the paper?
There is also some neat work out there to make multiplication and things like sin, cos, tan much faster. Autonomous driving is a lot of 8x8 matrix multiplication, and we have algorithms/logic to do this very specific thing. But it's a really hard engineering challenge and a tradeoff between big, complex, slower chips that offer matrix multiply and smaller, simpler, faster chips that only offer really basic math but stupidly fast. Which one is more efficient really comes down to the job of the hardware and the algorithm at hand.
It's limited by the fact that each algorithm it finds differs depending on the matrix dimensions?
So I don't see a wide application of this for math, unless we store each algorithm for each matrix shape.
Interesting, yes, but not groundbreaking.
For machine learning it is very important. Finding a different algorithm depending on the dimensions of the matrices is more a feature than a bug. The goal here is speed, not generalizability.
The other aspect of it that I really like is the application of AlphaZero on a different domain than games. It opens many possibilities for us!
4:17 This is not quite true. There are algorithms with a better time complexity: O(n^2.4) vs. Strassen's O(n^2.8). The problem is that these theoretically faster algorithms only outperform Strassen for truly gigantic matrices, which is why they aren't used in practice.
What time stamp are you referring to? 4:17 is not the one, I think.
@@underfitted I'm referring to your quote "53 years later and we still don't know if there is a better way to multiply matrices." at roughly that moment in the video. I guess it depends on your definition of "better" but asymptotically faster matrix multiplication algorithms than Strassen have been discovered.
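For anyone wondering where those exponents come from, this is the standard textbook derivation (not something stated in the video): Strassen replaces the 8 block multiplications of the naive divide-and-conquer scheme with 7, so the cost recurrence is T(n) = 7·T(n/2) + O(n^2), which solves to T(n) = O(n^{log_2 7}) ≈ O(n^{2.807}). The asymptotically faster methods mentioned above (the Coppersmith–Winograd line of work) push the exponent down to roughly 2.37, but with constants so large they only pay off for astronomically big matrices.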
Quickly saw this is an awesome channel and quickly subscribed.
But I'm shocked it only has 12k subscribers.
You deserve a million.
Thanks for sharing.
I’m just starting. Thanks for the support!
I have read that AI hardware is going to move to analog signals, because you can do multiplication just by passing a current through a resistance. The other problem of matrix multiplication is the accumulated error caused by floating-point representation in a binary (digital) system.
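A rough sketch of the physics that comment alludes to (a simplified, hypothetical crossbar row, not any specific chip; the numbers are illustrative): each weight is stored as a conductance G, the input is applied as a voltage V, Ohm's law gives the per-element product I = G*V, and the shared output wire sums the currents (Kirchhoff's current law), so a whole dot product happens "for free" in the analog domain.

def analog_dot_product(voltages, conductances):
    """Model one row of an analog crossbar: each product is a current G*V (Ohm's law),
    and the shared output wire sums those currents (Kirchhoff's current law)."""
    return sum(g * v for g, v in zip(conductances, voltages))

# Example: conductances [0.5, 1.0, 2.0] (siemens) and inputs [0.1, 0.2, 0.3] (volts)
# produce a summed current of 0.5*0.1 + 1.0*0.2 + 2.0*0.3 = 0.85 amperes.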
I just subscribed. Keep up your fascinating work.
Thanks!
Haven't read the paper, but I am curious how the new algorithms behave numerically. Some algorithms that are fast are also not practical to use due to their poor numerical stability.
Machine learning does not care about stability. They are even using 8-bit ints instead of 32-bit or 16-bit floats for the inference calculation to reduce the hardware needed. Going from 16-bit training to 8-bit inference has a very poor effect on numerical stability.
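For context, here is a minimal sketch of the kind of 8-bit quantization that comment refers to; a symmetric, per-tensor scale is assumed here (real frameworks offer many variations):

import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor quantization: map float values to int8 with a single scale."""
    max_abs = np.abs(x).max()
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

x = np.array([0.02, -1.3, 0.7], dtype=np.float32)
q, s = quantize_int8(x)
# dequantize(q, s) is close to x, but the rounding error is exactly the kind of
# precision loss the comment says inference workloads tolerate.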
An essential part of an equation is an equals sign, which is missing in all of the items referred to here as "equations".
To what extent can AlphaTensor help with very large matrices? On very large matrices I got a 40% speedup simply by using cache-coherent operation ordering.
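For readers unfamiliar with that trick, this is the kind of reordering the comment is likely describing: a minimal loop-tiling sketch where the arithmetic is unchanged and only the traversal order is made cache-friendly (the block size of 64 is a made-up placeholder you would tune per CPU):

import numpy as np

def blocked_matmul(A, B, block=64):
    """Multiply two square matrices in cache-friendly tiles.
    Same arithmetic as the naive algorithm, just reordered so each tile of
    A and B stays in cache while it is being reused."""
    n = A.shape[0]
    C = np.zeros((n, n), dtype=A.dtype)
    for i0 in range(0, n, block):
        for k0 in range(0, n, block):
            for j0 in range(0, n, block):
                C[i0:i0+block, j0:j0+block] += (
                    A[i0:i0+block, k0:k0+block] @ B[k0:k0+block, j0:j0+block]
                )
    return C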
Gee, back in the mid 1970's (high school), before we all had home computers, I gave myself a project of reducing the number of math operations required to solve large multiplication problems. Came up with some reductions, but did not have much use for it at the time. Now I'm trying to remember just what I did. Question: In general, does AI come up with solutions because it tries all the possibilities, eliminating those processes that apparently don't work, while optimizing the procedures that show some promise? For example if the problem was to come up with sets of 3 positive biquadrates that sum to another biquadrate, would AI just test combinations of numbers or would it actually develop a parametric equation if possible?
Very similar process to what you described, yes.
And no, this hasn't made any significant dent in the problem of multiplication. Even in their own hand-picked few problems the gain is, what, like 6 to 8 percent? That's not groundbreaking. One gets that kind of speedup, or slowdown, just from cache access.
We’ll have to disagree on that one. 😊
@@underfitted so how much was the speed up?
You should read the paper.
@@underfitted I read it; that's why I said it. The max improvement that is reported is like 18 percent.
That's not groundbreaking. Not even close.
I have a patent with my company where I take search over a 1-dimensional array and have it searched in O(3) to O(4) time regardless of data size (once the data structure has been constructed). The same thing is usually done with binary search, which is size-dependent.
That is called an improvement, not a 10 to 20 percent cost improvement.
I would not even count this as anything if it doesn't really give a good advantage.
Language is such a funny thing.
I like how the video title rhymes with the "Attention Is All You Need" transformer paper.
Yeah, I did that on purpose :)
I read that paper a while ago... and I was pretty amazed by what is written in it.
Yeah, it's really good.
Aside from the (mis)use of "equation," which you must have gotten for the millionth time by now, can you check your expression for M6 in the case of the 2x2 matrix? I think there should be a negative sign in the first pair of brackets in M6, just like M7; otherwise you don't get to calculate the ab_22 element of the resulting matrix.
Also, showing how to combine M1 to M7 to get the 4 elements might be good (even though anyone interested can simply look it up and verify it on Wikipedia).
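For readers who don't want to dig through Wikipedia, this is the standard Strassen (1969) formulation for the 2x2 case, showing all seven products and how they combine into the four result entries. It is written here with plain scalars; in practice a11, b11, etc. are sub-blocks and the + and * are matrix operations:

def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with 7 multiplications instead of 8 (Strassen, 1969)."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)   # note the minus sign in the first bracket
    m7 = (a12 - a22) * (b21 + b22)
    c11 = m1 + m4 - m5 + m7
    c12 = m3 + m5
    c21 = m2 + m4
    c22 = m1 - m2 + m3 + m6
    return [[c11, c12], [c21, c22]]

# Quick check against the naive product:
# strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]) == [[19, 22], [43, 50]]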
Yup, 2,000,000 comments mentioning that I said "equation" instead of "expression." :)
Another thought ...
Accuracy is improved with fewer calculation steps as well as speed; in some instances this gain might be even more important
You deserve to be on the level of 3blue1brown or Computerphile. You're amazing!
Thanks!
I wonder how much improvement we could get if we combined AlphaTensor with an analog computer chip like the one created by Mythic AI.
Lemme just say...
Sir you are amazing🤩
The hard work and the information presented in utmost detail, with creative editing, make this video very, very engaging and connecting.
Hats 👑⛑👒🎩 off to your effort.
Subscribed right away. 😀😄
Thank you so much 😀! Really appreciate your comment!
I've been waiting for such a video, but did Google publish the results they got?
They published their paper in Nature.
The conclusion seems very similar to what Stephen Wolfram says in a New Kind of science: start searching in the whole set of simple programs for the one that fits the behavior you are looking for. Boring task -> give it to a program.
This is terrifying in some way. A machine that is so 'clever' that it can write code itself.
It'll happen.
The cell is so clever it can evolve to an embryo. I'd call it amazing.
We actually have this already with genetic algorithms and some system identification strategies in engineering.
It's just that finding the right question and initial data, as well as proving your results, is oftentimes harder than solving the problem in the first place. But this approach really helps with discovering math and physics, as well as applying it.
Lookup tables are much faster but require memory. Things can be done to hybridise calculations, i.e., process easy calls and look up hard ones. For example, you don't need to process 2x3 AND 3x2; they are the same. You also only need to calculate what is necessary, and not everything at high resolution; the trick is knowing when to round off and when to be more accurate.
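A toy sketch of that hybrid idea (the table limit of 16 and the function names are purely illustrative): cache small products once, exploit commutativity so 2x3 and 3x2 share one entry, and fall back to computing anything not worth storing.

# Precompute products of small operands once; look them up later.
TABLE_LIMIT = 16
table = {(a, b): a * b for a in range(TABLE_LIMIT) for b in range(a, TABLE_LIMIT)}

def hybrid_multiply(a, b):
    """Look up 'easy' products, compute 'hard' ones on the fly."""
    key = (a, b) if a <= b else (b, a)   # 2*3 and 3*2 share one table entry
    if key in table:
        return table[key]
    return a * b  # fall back to a normal multiplication

# hybrid_multiply(3, 2) hits the table; hybrid_multiply(1234, 5678) computes directly.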
His hands are shaking from excitement.
Truly a maths genius.
I read somewhere that after finding these new algorithms, AlphaTensor applied them to itself and got faster!!
I believe this is the next step in the world of AI, where systems are not only able to figure out better algorithms but in fact use them to become faster over time.
Sounds like the A.I. computer operating system that learns all operations from scratch, i.e. reading disk, loading memory, running programs, is almost here.
Would the system be able to optimize its own algorithms? Would it be able to find better algorithms this way to improve itself recursively?
Not in its current version, no
Just subscribed and started to binge-watch all the videos. Channel is brand new, I hope it grows fast :)
Thanks for the support! Yeah, channel is 114 days old 😋
Maybe what’s next is an AlphaTensor-powered assembly optimizer. Seems like it would be the same problem but way bigger. Given that people treat optimizers like black boxes anyway, it would be pretty incredible to have a significantly smarter optimizer just show up. Maybe it has different versions for different instruction sets and cache sizes. Maybe it has a pre-built model for every consumer CPU. Squeezing every drop from your hardware with AI optimization is a pretty cool idea.
You can multiply matrices by recursion. It's the same operation time, but you can try to borrow results from smaller-dimension matrices.
There are many ways to skin a cat.
Do you think P vs NP can be solved?
I don’t think P = NP. The other question is whether we can prove that. I’m not sure about that one.
This was brilliant, props to you!
Thanks!
Dude! Fantastic video. I enjoyed watching it 😮 You are awesome 🌟
Thanks!
That is a scary concept. AIs finding better ways to code AIs. Great video.
Always brilliant, Santiago, and you choose your topics very well.
your content is very helpful and the quality of your videos is astonishing
Thanks, man!
It's not an equation, it's an expression; an equation is when there is an "=".
You are correct.
0:30 A simple "equation" indeed, it even lacks an equality! 😄
Yup. I should have said "expression."
Very good video! And I had already read the paper and blog post!
Thanks a lot for the comment! Really appreciate it!
Great video, I got the chills thinking about how crazy this actually is
I’m with you!
So a bunch of matrices finds a new way to multiply a bunch of matrices. Very cool.
Thank you for this informative video
Glad it was helpful!
What comes next is "Physics Diffusion," where we can input all the elements and have AI create exotic materials that exhibit negative space. Then boom, flying cars 😀
Literally the machines are training us to make them smarter.
U deserve more subs... Your research and editing is fabulous.
Thanks so much!
Wonderful video: short enough, clearly explained, enthusiastic.
You've earned a subscriber!
Thanks!
Appreciate you for compiling such informational content. 😊
You bet!
This means new possibilities. I think it will also optimize memory management, because a lot of time can be spent just waiting for the data. Thanks.
Thanks. Great info! Absolutely dislike the continuous dramatization, though. It's like, "OMG, OMG, the zombie is catching up" adrenaline rush and grabbing the webcam every few seconds. Completely unnecessary. Sorry if this hurts your feelings. Not the intention. Really want to improve your channel. I subscribed!
No feelings hurt. I'm learning, and improving with every video. Everyone's feedback is valuable because it's my only way to make progress. Thanks for reaching out!
First time came to ur channel, salute to you sir, the efforts, research etc etc have made this video awesome.
Looking for your guidance.
Thanks and regards
Thanks!
Next it also designs and optimises the hardware. Bet there is a way to drop a heap of transistors from every operation too
I don't doubt our ability to do anything, to be honest.
Amazing video, great editing, interesting topic. You got a new subscriber!
Awesome, thank you! Really appreciate it!
This man actually replies to (almost) *every* comment. Now that's dedication.
I try to. As long as YouTube shows them to me 😁
I am confused. Is an AI (whose DNA is matrices and their multiplication) now multiplying matrices as a problem to make them multiply faster? Does that mean it makes itself faster with time?
Shouldn't that fall into an inception loop?
By the way, did you see the video on "analogue chips" by Veritasium? It intends to speed up multiplication through the voltage, current, and resistance relationship, sacrificing unnecessary precision that digital chips bring along (which is sometimes not necessary to the final outcome in the AI's decision making).
2:22 Indeed, instead of going 2 times 4, we can do 1+1+1... 8 times, that's one less multiplication needed 😂
Thank you, great video. Perhaps the next step is something more and more exciting, like a 'living' tool that creates and replaces itself with more powerful and general tools each iteration, like living RL and Goodfellow's GAN derivatives, or like our brains... persistent focus is successful in nature, and birth-death cycles even more so!
I can’t wait for 2023!
Great Video great Graphics and Information Please continue making such videos
Thanks! Really appreciate the comment!
Absolutely amazing video!
Thank you!
Instant sub, love the content; just wish the energy were 10% lower.
Man… I wish I had a regulator 😋 thanks for the sub and the comment! I’ll keep improving!
Wow, this was really interesting and the way you delivered it made it even more fascinating! You've got a new subscriber :)
Thanks!
0:34 this is not an "equation"
You’re right
Very interesting topic and very good presentation. Kudos
Thanks!
Hi, the content and editing are top notch. After the ML concepts, continue with deep learning, and with breakdowns like this too.
Thanks a ton! Really appreciate your comment!
Pretty good editing skills.
I fucking love this video, a cool mix between math and computer science! Well done
Thanks, Koen!
The tick sound effects during the word switches became very distracting.
You are right. I got rid of it on my last few videos. Thanks for the feedback!
Imo the ending (“What comes next?“) nails it. There is some juicy potential for papers if you figure out how to turn other hard fundamental CS problems into games and apply Alphazero.
Right on!
Awesome content!!
But multiplies and adds take only one clock cycle on many CPUs now.
What a wonderful and engaging way to explain the complexity in simple terms. Thank you for all the hours and heart you put into your content.
Thanks, Carlos!
Hey, i think the editing style doesn’t fit for this kind of content (too fast/ dizzy) - i know the other comments liked it and it borrows from large youtubers but i feel like a bit less would be better (even if retention might drop a little).
Really appreciate your comment! Thanks for letting me know!
I've been trying to find a good balance, but overall, I'm trying to make the type of content that I like to watch. For example, although YouTube is full of "academic" videos, I find them boring and don't learn too much from them.
But I hear you! Everyone has a different style, and the secret is to find a good balance! I'll keep trying.
@@underfitted I completely agree with you. Your content reminded me a little bit of Johnny Harris's videos - very engaging, very creative, and it keeps the story alive.
I'm very interested in learning more about science topics and AI, but everyone I come across on YouTube is so academic and monotone about it that it becomes impossible to watch at times.
Two Minute Papers for instance is great, and I adore Károly Zsolnai-Fehér, but his monotonous voice does sometimes become a bit hard to listen to, and he also loses audience when he starts talking about technical data.
Your content was a breath of fresh air, and you've earned yourself a sub! I look forward to seeing more content from you in the future :)
My one recommendation for now would be to either soften or remove the volume of the clicks whenever the word changes on the screen - I found it to be slightly detracting from what you were saying.
But otherwise it is really good stuff!
Joseph, this "... but everyone I come across on YouTube is so academic and monotone about it that it becomes impossible to watch at times" describes exactly the reason I started making these videos.
There are fantastic creators on YouTube talking about AI/DS/ML, but I need something different.
A few people have commented here because they find this style "weird" and "not serious" for ML content, and I agree :) But that's a feature, not a bug.
I want to build something different.
First of all, x²-y² is not an equation.
You are right. What’s second of all?
so clearly explained, please do more on science, math related topics
Thanks! Will do!
The Editing is on fire 🔥
I also upload videos on ML on my channel and you are a true motivation for me to get up and make a new one. Your consistency inspires me.
Thanks, Pritish! I'm so happy you liked the editing. It took quite a bit of work, but I'm happy with how it turned out.
Keep going at it!