DeepMind’s New AI Remembers 10,000,000 Tokens!
- Published: 28 Sep 2024
- ❤️ Check out Microsoft Azure AI and try it out for free:
azure.microsof...
📝 The paper "Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context" is available here:
storage.google...
📝 The paper "Gemma: Open Models Based on Gemini Research and Technology" is available here:
storage.google...
Try Gemma:
huggingface.co...
I would like to send a big thank you to Google DeepMind for providing access to Gemini 1.5 Pro to test it out.
Sources:
/ 1760468624706351383
/ 1761113846520131816
simonwillison....
/ 1761459057641009354
📝 My paper on simulations that look almost like reality is available for free here:
rdcu.be/cWPfD
Or this is the orig. Nature Physics link with clickable citations:
www.nature.com...
🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Gaston Ingaramo, Gordon Child, John Le, Kyle Davis, Lukas Biewald, Martin, Michael Albrecht, Michael Tedder, Owen Skarpness, Richard Sundvall, Taras Bobrovytsky, Ted Johnson, Thomas Krcmar, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here: / twominutepapers
Thumbnail background design: Felícia Zsolnai-Fehér - felicia.hu
Károly Zsolnai-Fehér's research works: cg.tuwien.ac.a...
Twitter: / twominutepapers
#deepmind - Science
What a tiiiiiimeeeee to be aliiiiiiiiiiiiive!
What a time to be alive!
It gets eeeeven better
Quelle époque où vivre !
(Wait ... It sounds better in English...)
What a time to be alive!
Really what a good time to be alive!
I love how the goals for this technology are so high that we get disappointed when it takes 1.5 hours for an ai to remember everything in 10 movies
Right 😂 ?!? Society's advancement and the competitive systems we live in demand it, unfortunately.
In 5 years these numbers will sound ancient. An AI will probably be able to watch a movie in a second.
@@krishmav Pretty hopeful estimate there; the speed of AI is largely limited by the hardware it runs on, and it's highly unlikely that hardware will be hundreds of times faster within 5 years.
True
@@mgord9518 The newest Nvidia chip announced is already 100× faster than consumer hardware.
It's currently for servers.
I wish for someone to feed an AI with heaps of dolphin or whale singing. Maybe we can actually find a way to understand what they communicate and to maybe talk back to them
there is actually a project going on right now that is attempting to do exactly that - use AI to try to talk to whales 🙂
Check out Whale-SETI! We actually started doing this!
That wouldn't work; you'd also need to feed it context of what the creature is doing, and no amount of words can define all the nuance and possible interpretations, so the AI will only be able to learn a simplified version of their basic emotions like fear, happiness, or grief.
For example, hypothetically let's say dolphins remarkably have unique names and call each other by their names. A human might observe this behavior and describe it as a generic greeting, not knowing dolphins have names. The AI will see that dolphins make random sounds during greetings and assume dolphins greet by making random excitable noise. The only way to actually discover this is through targeted tests; it's not something that can be spontaneously discovered in existing data.
@@dawiedekabouter5733 How are you gonna do an fMRI on a sea creature 😂 let alone a whale?
@@iminumst7827 The way they're doing it is noting down what is happening around the whales at the time, then putting that alongside the communication between them.
Wtf? It takes me 20 hours to watch 10 movies and then another 10 hours to write an essay about them. You're telling me this AI can do it all in only 1.5 hours??? How is that not an awesome thing?
I think the issue is that, having watched 10 movies, EVERYTHING it does is now slow, even tasks unrelated to those videos. So it's like you watched 10 videos and now you can't hold a conversation, since it takes you 1.5 hours to process anything anyone tells you.
You can write a whole essay in 1 hour? That's incredible.
i think the issue is the response to each prompt would take that 10x
@@tuseroni6085 is that really how it works?
@@mikopiko What do you mean?
I've said for a while now that this technology should be used in a massive global rare-languages rescue effort.
and solving aging and cancer.
In an interview Ilya chief scientist of OpenAI explained with certainty how to solve climate change with carbon capture. His certainty about the best method made me think that he'd already asked a superior AI for the solution.
Nice! Now we can build a Solo Leveling AR game with AI :)
Maybe wearing a smart glass to capture the exercises.
Did you already cover "Suno", the AI music generator?
I tried it and it's good
He did cover that tech like a year ago before it was commercial.
What a time to use AI!
Your voice is weird in this video, is it ai generated?
You'd think mega-context inputs would be reformulated by an executive AI into various higher-level abstractions, like a linguistic computer-code representation of the context that modularises language content into functions, structures, libraries, and so on, so that the vast bulk of tokens isn't even needed and is replaced with compact meta-token instantiations with even higher latent-space salience within the newly factored smart context.
Joint embedding predictive architecture aims to do that
Looking at the problem of processing time with longer context, what do you think about Mamba as a solution for a subquadratic architecture for future LLMs? Maybe you could make a video about Jamba in the future.
What is all the mumbo jumbo about?
Maybe use them to make the AI do some samba
@@Faizan29353 I could not find a fireship video about mamba. Did you mean bycloud?
"aaand"x100000000
In the future AI will be able to consume all the media of a certain franchise (eg. Marvel) so it can help in building the Wiki of that franchise. You’ll also be able to ask it extremely specific obscure questions that even the most diehard fans and prolific Wiki contributors wouldn’t know.
this is already possible
Mamba long context research is the future.
Just run them in parallel to get around the 100x quadratic issue.
I want to use AI to help my readers ask questions about my books (What I will be writing). And also to help me to avoid continuity issues. And to help with editing my book.
I'm unsure about the context abilities of AI... I used the GPT 4.5 API with a 128k context window and it never followed any of my prompts; it basically just summarized the book or video transcript I gave it. Basically unusable...
I love the "aaaaand" so so much ; )
ChatGPT must be having a hard life.
Imagine having a new brother or sister everyday...
Haha savage
No, it didn't give me a warm and tingling feeling.
IT STOLE MY DAMN JOB...
1-bit LLM architectures and AI inference cards will probably reduce inference time by orders of magnitude; see Groq.
I'm running out of imagination what we will be able to do two papers down the line...
Can you imagine one integrated LLM that also holds the sources of academic papers of all kinds, so you have not only the data but scientific data? I don't know if I'm explaining it right, sorry for the bad English.
You mean like scientific analyses and results?
@@loopingdope Yeah, I don't know if the state of the art can have its research in these academic hubs.
Still scares me a lot: instead of watching 10 movies straight through, it does it in only 1 hour!
Instead of trying to remember everything, what if it continually summarised the key points of everything it has seen?
That way its memory would be more efficient at the cost of accuracy.
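The summarize-instead-of-remember idea above can be sketched as a toy memory class. Everything here is a hypothetical illustration: `summarize` is a stand-in placeholder (a real system would call an LLM to compress old turns), and the class and names are made up for the sketch.

```python
def summarize(old_summary: str, dropped: list[str]) -> str:
    # Placeholder: a real system would ask a model to merge `dropped`
    # into a few key points alongside `old_summary`, losing detail.
    return (old_summary + " | " + "; ".join(dropped)).strip(" |")

class RollingMemory:
    """Keep a lossy summary plus only the most recent exact messages."""

    def __init__(self, max_recent: int = 4):
        self.summary = ""      # compressed long-term memory (lossy)
        self.recent = []       # exact short-term memory (bounded)
        self.max_recent = max_recent

    def add(self, message: str) -> None:
        self.recent.append(message)
        if len(self.recent) > self.max_recent:
            # Fold the oldest overflow messages into the summary.
            overflow = self.recent[: -self.max_recent]
            self.recent = self.recent[-self.max_recent:]
            self.summary = summarize(self.summary, overflow)

    def context(self) -> str:
        # What would actually be fed to the model: summary + recent turns.
        return self.summary + "\n" + "\n".join(self.recent)

mem = RollingMemory(max_recent=2)
for msg in ["scene 1", "scene 2", "scene 3", "scene 4"]:
    mem.add(msg)
print(mem.context())  # old scenes survive only in the summary
```

The trade-off is exactly the one the comment names: the context stays small no matter how much is seen, but anything folded into the summary can no longer be recalled exactly.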
An artificial intelligence twin of Károly Zsolnai-Fehér is narrating the video? 😮
I have never used any of them - every single episode is me flipping out behind the microphone. 😀
@@TwoMinutePapersThey are able to distill data from NNs ruclips.net/video/fk2r8y5TfNY/видео.html What's your opinion on that?
Please, stop with the extremely exaggerated, staccato voice modulation, it's way too much, I can't stand it any longer.
congrats on the new sponsor mister 2 minutes
What is the meaning of life and everything
Oh just give me a million years to think about that
Oh great machine what is the meaning of life and everything
42
claude is still not available in my region :(
Mamba and RWKV aren't quadratic, I think; they're not based on transformers.
No AI can beat the math puzzle 24.
I miss when your videos were about technical papers on simulations and AI, explained for ignorant people like me. Now this channel looks like a quick news portal about chatbots, just like any other that can be found on YouTube.
I bet that will be easy to jailbreak.
The more tokens, the easier it is to jailbreak.
This is all very exciting, and I've been on the AI hype train for a while, but I saw a video recently from The Hated One that claimed that AI uses an unsustainable amount of water. Thoughts?
If it doesn't require an entire server farm to keep up, like 512 GB of RAM and 100 TB of storage, and therefore a $15-per-month subscription to pay the bills, then it's actually cool.
It's actually 512 GB of VRAM... 512 GB of RAM would be kind of OK.
@@apoage That's not true; VRAM is only required by image-generation neural networks like Midjourney or SD, but language models do require RAM.
Were they using it to lift weights... and biases? 🤭😁
Could the advancement with the GPTs help solve the Voynich manuscript?
it might, although ...why
No.
@@bluthammer1442 To extract the contents of the book.
There's no meaning behind it. It's just horseshit.
@@EricDMMiller How'd you draw that conclusion?
Most of the world is operating at a pace and level of progress we were operating on decades ago. Now, people diving into ai are progressing faster than ever. Do we call this spaghettification of society?
3:45 so exactly how is that not practical?
a human would take weeks to summarize the contents of 10 movies, after all.
or was that just a poor example?
Train an AI to understand and translate hieroglyphics pleaseeee
I don't remember what I ate for breakfast
That's also my impression that Claude 3 is superior at coding.
What about project Nimbus
WHAT A TIME TO BE ALIVE!!! Timestamp: ruclips.net/video/Z_EliVUkuFA/видео.htmlsi=XO0lmlWhieGHi20T&t=339
good job
I know it's a small thing and I don't mean to be rude but the thumbnails with the old man in a graduation outfit doing an :O thumbnail face is kinda disturbing. I don't know why
Why not use AI to make the videos about AI research? It would save a lot of time. Just fine-tune a model on millions of tokens of data; the AI would be able to write the script in your style, without any detectable difference between the writer's real style and the AI's.
cool
I await the day where there will be a video about Tesla robotaxi 😊
Bro I’m not excited for this change
MS promotes??? I'm a little saddened.
bro please get rid of the intro, it's always way out of place
Claude 3 is the best coder. Gemini is the laziest, dumbest model. Yes, it can remember better, but who cares; it's not giving the best results for my use cases.
It's just as lazy as GPT-4, maybe even more sometimes.
It's also woke garbage. If I want to be brainwashed by Marxists, I'll just watch a recent Disney movie.
Please drop the staccato speaking so I can finish watching one of your videos without being annoyed.
The content always keeps me coming back, but his way of speaking really gives me anxiety. I really hope he works on it.
Who actually gives a damn about LLMs in 2024?
nothing has been exciting since 2020.
Gemini > ChatGPT
Did they fix the black Vikings thing?
No way I'm using that woke Marxist AI crap. Google is infiltrated to the core, nothing can fix that level of indoctrination.
I'd like to see LLMs have a go at learning extinct languages with great historical significance like Sumerian and Akkadian
I personally am already tired of the eversame AI generated thumbnails.
Perhaps the solution to the quadratic complexity is to implement a short-term/long-term memory system with an artificial hippocampus to help it remember.
I'll get right on it! :)
Yes people have been saying this for years
Yes, and no.
Sometimes it is good to forget.
Our brains have evolved to forget.
I suspect there is a better balance.
Never forgetting would be a massive burden.
So yes, short term long term, but not everything.
It is good to let go without even trying. ;)
Take care,
Jeremy
There have been a variety of approaches that improve on this around for quite a while.
@@Jeremy-Ai Our memory is based on feelings; intense events are easier to remember. But artificial intelligence has no feelings, so maybe creating that is the next big step to increase its intelligence. That's assuming the goal is to create conscious artificial life and not just more precise tools.
"What a time to be alive!" In 10 years we'll start saying "What a time to be dead!"
“When you realize ‘Two Minute Papers’ has more plot twists than your favorite TV show, and all it took was a couple of minutes and some groundbreaking science.” 📜✨
imagine the plot twist only two papers down the line😅
Most of it doesn't surprise me because I'm intelligent & optimistic enough to expect most of it, I just watch 2 Minute Papers to find out what currently exists & exactly what it's like.
It's about time to feed such an AI all the necessary books and data about transformer NNs and have it make itself, and better.
ngl i am literally waiting till this guy gets replaced by ai
What a tiiiiiimeeeee to be aliiiiiiiiiiiiive!
I like that 'what a time to be alive' line poping up in all your videos ...indeed it is.
The self-attention over 10 very different movies, though, could easily be non-quadratic. Why would the movies Oppenheimer and Barbie need to cross-reference one another in the self-attention layer?
Okay, the details are more complex, but the basic idea of separation holds.
I would say it's just a problem with the current architectures. I guess something better and more efficient will come after transformers that could even solve this problem.
If you didn't want them cross referenced then you wouldn't input both movies right? You would only input 10 movies at the same time if you want all 10 movies to be considered in your next prompts/responses for whatever reason that may be. If you just want to interact with one movie then you just input one movie.
Do you think you'll go back to talking about the developments in simulations and light transport? It seems almost every video is about generative AI now
I'm positive your voice is AI. No way you talk like that. I'm expecting any day a video like "I fooled you for more than a year."
But I'm not falling for it.
Wow. Thank you.
Kind of a lot you got wrong here or didn't explain. When you say one movie: it was an old movie that runs at a few fps, and it can't take much video at a reasonable fps; I think it was around 10 minutes. And it can't speak the language as well as a native speaker; it can translate it as well as a human who had the same amount of info (a translation book). I think these are pretty important differences, and that's just what I can remember off the top of my head.
Even if it's an old movie with fewer fps, it still lasts at least 1 hour, so for it to be 10 minutes, modern movies would need 6 times more fps. Modern movies are 24 fps, so for that to be 6 times greater, those old movies would have to be 4 fps; that's not a movie, it's a PowerPoint. I'm sorry, but I don't recall any moment in the history of cinema when movies ran at 4 frames per second. I believe you when you say they had fewer fps, but it could be as little as 12 fps, and it was probably more, maybe 16 or so. So we're not looking at 1 hour vs 10 minutes; we're looking at 1 hour vs 40 minutes, more or less. And it probably wasn't an hour-long film; it would be 1.5 hours or so, so it would be 1.5 hours vs 1 hour.
@@alvaroluffy1 Many YouTube videos are 60 fps, and that is mostly what this will be used for, not low-fps silent movies. I think it is important to talk about actual use cases, not just hype it up under perfect conditions. So this 44-minute video would translate to slightly over 10 minutes.
@@countofst.germain6417 He was talking in terms of movies; 10 movies is not 10 hour-long YouTube videos. You're the only one who made that wrong assumption; nobody else is talking about or understanding this video in those terms. He is very clear about it.
@@alvaroluffy1 As I said, he should be talking about actual use cases and not hyping it up under perfect conditions; there's very little reason to use this on movies. Also, my point about the translation is completely valid. As I said, I was doing this from memory; this fool was researching it for a video and made lots of mistakes.
@@countofst.germain6417 First, I didn't talk about the translation, but he puts up text saying literally that: someone with the same info from the book. Second, if you can't look at the information in the terms it's provided in, then this is not your channel to watch. If you understand 10 movies as 10 hour-long YouTube videos, then stop watching this channel, or stop making those assumptions. And finally, you're talking like YouTube videos are going to be the main use case, but you have no idea, because no one has any idea; and even if it were true, there will be tens of use cases. This isn't going to be one thing that uses videos from YouTube and that's all. You're the fool for making all those assumptions and projecting them onto him. If you can't see things clearly, if you're like the media that can't stop lying about and exaggerating the scientific texts they read, then stop consuming this content, because you're going to constantly misinterpret it and then spread disinformation wherever you go. So just stop being a fool.
Whenever he says "and" 😂
You deserve every watcher and all the praise in the world. From video 1 the content is concise and well put together. One of the better channels hands down.
3:47 saying this feels either correct or flying way too close to the sun. we were used to Jukebox taking four hours to generate audio that ultimately might not sound great, or even have any audible sounds in it, for years...
Was thinking: one movie takes me at least 2 hours to watch. The AI can watch 10 movies in less time!
If anyone is wondering, "What a time to be alive" is said at 5:40.
Great video!
I don't think quadratic complexity is actually a problem. It's only a problem where you want every token to "talk" to every other token. But what we really want is for all the input tokens (e.g. the input documents/movies/etc.) to talk to the output tokens (the model output). Then assume each input token only needs to read the previous K tokens for understanding. For N input tokens and M output tokens, that's O(KN) + O(NM) time, way less than O(N^2).
But that requires a new architecture, meaning it's no longer a transformer.
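The complexity argument above can be illustrated with a quick pair count. This is only a back-of-the-envelope sketch of the commenter's reasoning, not any real model's attention scheme; the window size K and the token counts below are made-up illustrative numbers.

```python
def full_attention_pairs(n: int) -> int:
    # Every token attends to every token: O(N^2) scored pairs.
    return n * n

def windowed_pairs(n: int, m: int, k: int) -> int:
    # Each input token reads only the previous K tokens: O(K*N) pairs,
    # plus each of the M output tokens reads all N inputs: O(N*M) pairs.
    return n * k + n * m

# Hypothetical numbers: a 1M-token input, a 1k-token answer, a 256-token window.
n, m, k = 1_000_000, 1_000, 256
print(full_attention_pairs(n))   # 1_000_000_000_000 pairs
print(windowed_pairs(n, m, k))   # 1_256_000_000 pairs, roughly 800x fewer
```

The count makes the comment's point concrete: for long inputs the O(KN) + O(NM) terms are dwarfed by the O(N^2) term that full self-attention pays.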
A million chances to get it wrong
yo! can we feed it the Voynich Manuscript to see if it can translate it?
Not sure why you'd want to. It's not like the Manuscript contains the theory of everything or how to beat cancer or something like that.
@@feynstein1004 you don't know why anyone would want to translate the most mysterious document we've ever found? there are images of plants unknown to science! and the text's characters and language have eluded every attempt at translation. Lots of people want to know what it says even if it's a just a cough medicine recipe!
@@PandemoniumLord Lol you're too easily swayed, my friend. I could write some random gibberish right now and convince people it's a mysterious document with magic powers.
Anyway, the manuscript is hundreds of years old. Even if it wasn't a joke, what useful information could it possibly contain that we don't already know by now?
@@feynstein1004 what do you mean too easily swayed? I’d be satisfied even if the text is mundane. You can’t dispute though that successful translation of the voynich manuscript would be a great accomplishment for AI, and could serve as a benchmark for future AI research.
@@PandemoniumLord That's exactly what I'm disputing. It's just some random piece of information. I think there are better things to spend your brainpower on 😉
I tried Gemini 1.5 Pro and it was very underwhelming. It was hyped to be the new cutting edge multimodal AI, but it falls behind Claude 3 Opus and GPT 4 in a lot of areas, plus it is so slow, its 1M context window becomes almost unusable
Honestly, learning Kalamang is an impressive feat for an AI. Can't wait for it to be able to talk to aliens (if they teach it) and animals. Imagine the future where you can talk to your dog..... and then it talks back to you. It will be like another wife 😆🤣
How come AI app creators don't merge 3D game design with AI generators? You would finally have the stability of 3D character design that looks the same from every camera angle and is easy to pose like a 3D model, mixed with the speed of AI-generated scenes and lighting.
Google might have more tokens and bla bla, but it is just straight stupid even compared with GPT 3.5 for simple tasks
I foresee that ultimate power over humans will come after a system has an unlimited token window and doesn't treat each instance of communication with us as individual, but is constantly aware of all of our inputs and interactions, and fine-tunes itself on our questions and answers while feeding us single-instance responses.😅
I can't even remember 10,000,000 tokens, WOW (kind of a joke, but I wouldn't be surprised; my memory isn't great).
1. Wake me up when ByteMamba works.
2. Is Weights & Biases officially worse than Azure now?
My whole problem with AI right now is the inherent bias to it. It can be the smartest person in the room but it's being forced to mislead. And that isn't something I can get excited about just yet.
If you had actually done more research, you would know that quadratic complexity in the attention mechanism has been solved for a few months now.
That long context window will be a godsend for opposing sides in Congress when thousands of pages of some bill are pushed to be voted on overnight 😅😅😅
does something ground breaking
takes 90 minu-DONT WANT IT
This started as a cool 3D channel and now its generic AI articles from Nvidia and Google.
I have to unsub, I would love the old channel to come back.
He just did Blender like yesterday? It's just that physics simulation stuff and AI are closely related.
feed it the Voynich Manuscript
Not available in the UK or EU. 😭
Press F for λ labs.
Token limitation is by far the worst thing about the local AIs I have. Most only remember up to 4096 tokens, and about a third of that is taken by setup prompts. I'd be happy with 50k tokens, but I really wish they could just retain memory forever, even if vaguely, like we do.
Thanks!
5:40 ^^
Hopefully we can get SSM based architectures that have roughly linear token scaling during inference to be as big as Gemini and the like. Perhaps a greedy MoE approach with transformers for short range context and SSM or Mamba for long range.