The inherent inefficiency of current AIs is that each node has to multiply the output of each preceding layer by the learned weight of that connection. Many of these learned weights will be near zero, i.e. irrelevant, but the multiplications still happen. This means we are wasting huge amounts of energy multiplying billions of numbers by roughly zero, then ignoring the output.
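A minimal numpy sketch of that waste, under my own illustrative assumptions (a made-up 4096-wide layer and a 90% pruning threshold): the dense product touches every weight, while pruning the near-zero weights into a sparse matrix skips most of those multiply-accumulates, and the script prints how much the output actually shifts.

```python
# Minimal sketch (illustrative only): dense vs. pruned/sparse layer multiply.
import numpy as np
from scipy.sparse import csr_matrix

rng = np.random.default_rng(0)
x = rng.standard_normal(4096)                 # activations from the previous layer
W = rng.standard_normal((4096, 4096)) * 0.02  # dense weight matrix

dense_out = W @ x                             # every weight participates, near-zero or not

# Prune weights whose magnitude is below a threshold (here ~90% of them).
threshold = np.quantile(np.abs(W), 0.9)
W_pruned = np.where(np.abs(W) >= threshold, W, 0.0)
sparse_out = csr_matrix(W_pruned) @ x         # only ~10% of the multiplies remain

print("kept weights:", np.count_nonzero(W_pruned) / W.size)
print("max output change from pruning:", np.abs(dense_out - sparse_out).max())
```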
Like the convolution reverb in Ableton Live that took impulse responses from famous music halls around the world and then let you hear whatever sound you're playing as if it were played in that famous concert hall? Sick 🤙🏻
Spiking neural networks have always been more of an interesting toy tbh. I don't remember any application from my academic days where they really showed any advantage.
They are the key. They've always been. And until very recently, traditional neural networks were also considered simple toys that would never become useful. Yet now the same people who believed that are investing millions in them. We just need some neuromorphic breakthroughs and bam! True AGI. And as always, investments in research are the key. Just wait.
@@ronilevarez901 there was a period of "just use an SVM instead" but I wouldn't say ANNs were ever written off as mere toys. But oh well, we'll see, won't we? There are many other interesting developments in hardware support for non-spiking neural networks as well, and I don't think anyone can be certain which technology wins out in the end.
I just finished watching this video and already saved it to watch later so I can comprehend it all a little bit more. One thing this video made me imagine was the possibility, in the near future (don't ask me to define the time horizon), of 'having' the most advanced brain (thanks to SNNs) available for each human being. I can't explain it, but it's another level compared to implanting a chip or the Matrix analogy.
The introduction of fully independent AI integrated into products is both exciting and scary. Maybe in the future it will be so economically and energy-wise inexpensive that we can grant AI to products. Imagine how integrated and smart the world would be if every human-made item could communicate with every other item and with humans. Though this would only be possible if the energy requirement of the integrated AI is low enough to run on built-in solar panels or rechargeable batteries.
You can run chatbots on your own computer. My PC consumes around 90 W during an input to a chatbot and needs 2-3 seconds for a response. There is nothing even close to 300 Wh. It takes a lot of power to train the models but not to run them!!! 300 Wh 😂 and btw the earth is flat, right?
I have a question for everyone who watches this video. Let's say you eliminated such a huge energy expenditure: you developed a processor and an artificial intelligence model that consumes much less energy and runs much faster. How will you solve the security problems it will create? For example, cracking passwords and bypassing digital security at such high speed would not even be a challenge. Developments in the field of security are not as rapid as in the field of artificial intelligence. Such terrible power always poses a serious danger in the hands of those with evil intentions. In this case, would you prefer a very advanced processor (and hence artificial intelligence), or security?
During dark room meditation I have observed similar images gently moving, traveling, swirling through my mind, such as I witnessed at 14:32 to 14:57. They all appeared more like the final images at 14:54 to 14:57 in this video. Even though I had non-specific thoughts during my meditations, the greyish blue images moved continually as I observed them.
I don't like the way people think with new tech: "The newer, better stuff isn't practical because it is incompatible with the old, traditional stuff." A better way to think would be "The older, more traditional stuff isn't practical because it is incompatible with the newer, better stuff." We need to get on board with things like IPv6, spiking neural networks, etc. and abandon the old, decrepit systems we insist on supporting because there are 35 people and 1 institution which still use systems that are 50 years old.
Is it possible to combine Spiking Neural Networks (SNNs) and Artificial Neural Networks (ANNs)? For example, CNNs can be combined with Transformers, where the CNN handles initial feature extraction and then passes the processed information to the Transformer. Similarly, SNNs could be integrated with ANNs, where SNNs preprocess and extract features from visual data before passing it to an ANN for further processing. Could we extend this concept to create a hybrid model with SNN layers inside a Transformer architecture? The SNN layers would handle early feature extraction, leveraging their efficiency and temporal processing capabilities, and then feed these extracted features into the Transformer for high-level sequence modeling (see the sketch below). This approach could combine the strengths of both networks, resulting in a more powerful and efficient system. I see this as a more viable approach. There are also chips coming that are specifically made for ANNs and that will be far less inefficient.
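One way that hybrid could look, as a purely hypothetical PyTorch sketch (the layer names, sizes, and the leaky integrate-and-fire front end are my own illustrative choices, not from the video or any specific paper): a spiking front end converts the input sequence into spike-rate features, which a standard Transformer encoder then models.

```python
# Hypothetical hybrid: leaky integrate-and-fire (LIF) front end + Transformer encoder.
import torch
import torch.nn as nn

class LIFFrontEnd(nn.Module):
    def __init__(self, in_dim, out_dim, steps=16, decay=0.9, threshold=1.0):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim)
        self.steps, self.decay, self.threshold = steps, decay, threshold

    def forward(self, x):                        # x: (batch, seq, in_dim)
        current = self.proj(x)
        v = torch.zeros_like(current)            # membrane potential
        rate = torch.zeros_like(current)         # accumulated spike counts
        for _ in range(self.steps):
            v = self.decay * v + current         # leaky integration
            spikes = (v >= self.threshold).float()
            v = v - spikes * self.threshold      # reset after firing
            rate = rate + spikes
        return rate / self.steps                 # spike rate as a feature

class HybridModel(nn.Module):
    def __init__(self, in_dim=64, d_model=128):
        super().__init__()
        self.snn = LIFFrontEnd(in_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, x):
        return self.encoder(self.snn(x))

out = HybridModel()(torch.randn(2, 10, 64))      # -> (2, 10, 128)
```

In practice the hard spike threshold isn't differentiable, so real SNN training typically needs surrogate gradients or ANN-to-SNN conversion; this sketch only shows the forward data flow.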
The next phase of artificial intelligence will be models designed to infer rather than generate. Transformer models seem likely candidates for high-fidelity inference. Instead of predicting what the next word will be, inferring conversational AI will predict what the next concept will be. Conceptualizing AI will use tokens as notions and concepts instead of words or phrases. The solutions are only a few dimensions away.
21:43 a Hala Point system "with over 2,300 embedded Intel x86 processors for ancillary computations" is going to consume way more than 2,600 watts. Intel has never achieved lower than 1.5 watts per core, so I think Intel is saying that the 1,152 Loihi 2 processors _alone_ consume only 2,600 watts.
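A quick check of that arithmetic, using only the figures from the comment itself (the 1.5 W/core floor is the commenter's claim, not an official Intel number):

```python
# Rough sanity check of the comment's numbers (assumed figures, not Intel specs).
embedded_x86_cores = 2300
watts_per_x86_core = 1.5              # the commenter's claimed lower bound
loihi2_chips = 1152
stated_power_w = 2600

x86_power_w = embedded_x86_cores * watts_per_x86_core
print(f"x86 cores alone: ~{x86_power_w:.0f} W vs the stated {stated_power_w} W")
# ~3,450 W > 2,600 W, so if both numbers hold, the 2,600 W can only cover
# the Loihi 2 chips themselves, which is the commenter's point.
```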
Disappointed to see how supposedly inaccurate/misleading the video is regarding energy consumption of current AI models. Glad I read the comments before quoting this at work.
I will say, when I studied graduate-level mathematics in my forties for the first time, every time I made progress in reading a proof, my brain immediately wanted more food. Meaning I had to eat while reading the proof. Makes you wonder how our brains might perform if we could get more energy into them.
Bro, 30 questions to Bing or GPT consume 0.45 L of water, but 300 Wh per question is madness. It must be more like 3 Wh, and even that is still a lot. But in the future this problem will be solved with edge computing, native multimodality with no latency, asynchronous neuromorphic chips, the 1 nm process (key for multiplying the army of transistors from the current 208 billion in the 4 nm Blackwell to 4 trillion at 1 nm, allowing 75% of the chip area to be simply turned off without losing too many FLOPS of processing power), non-volatile resistive memristor memories that don't require energy to keep a 1 alive, and low-precision models working on fs2 or fs1. Not to mention photonic chips. With all these systemic measures, the human brain's low energy consumption in TDP performance should be matched by AGI in 2029.
Why can't spiking be handled as time-integrated analogs, with float values representing the integrated spiking over a certain time-step size; essentially smoothing out the behavior over time, and including the leaking behavior the same way as presented here, where a neuron needs a sum of inputs that adds up faster than the leak drains it (which, when you think about it, is pretty similar to ReLU)?
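A toy numerical check of that idea, under assumptions I picked myself (constant input current, a simple discrete leaky integrate-and-fire update, and a crude steady-state rate formula): once the neuron is firing regularly, its time-averaged spike rate does look like a shifted, clipped (ReLU-like) function of the input.

```python
# Toy model only: simulated LIF firing rate vs. a smooth rate approximation.

def lif_rate(i_in, steps=5000, decay=0.95, threshold=1.0):
    """Average spikes per step for a constant input current."""
    v, spikes = 0.0, 0
    for _ in range(steps):
        v = decay * v + i_in          # leaky integration
        if v >= threshold:
            spikes += 1
            v -= threshold            # reset on spike
    return spikes / steps

def smoothed_rate(i_in, decay=0.95, threshold=1.0):
    """Crude steady-state approximation: input minus what the leak drains,
    assuming the membrane sits around threshold/2 while firing regularly."""
    if i_in / (1.0 - decay) < threshold:   # never reaches threshold at all
        return 0.0
    return max(0.0, (i_in - (1.0 - decay) * threshold / 2) / threshold)

for i_in in [0.1, 0.2, 0.5, 1.0]:
    print(i_in, round(lif_rate(i_in), 3), round(smoothed_rate(i_in), 3))
```

The approximation only captures average rates; whatever information is carried by precise spike timing is lost, which is the main thing a rate-smoothed model gives up.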
The brain of a single human, no matter how smart, cannot be considered on its own without the rest of civilization. Hence, to calculate the actual amount of energy required for that single thought, you must also take into account the energy consumed by all the people who have contributed towards the development of that individual (at least).
I develop ML models, mostly CNNs but also autoencoders and transformers. I was expecting an oversimplification of how NNs work, but everything was surprisingly accurate throughout the first part of the video. Apart from the fact that inference is orders of magnitude cheaper than training.
An iPhone needs about 4 Ah at 4 volts, so about 16 Wh to charge, so 300 Wh is less than 20 times the energy required to charge an iPhone, not 60. Same with other phones. Still a lot. Fast charging on an iPhone 12 Pro (with a 4,082 mAh battery) typically requires around 20-30 watts of power to charge from 0-50% in about 30 minutes. So, considering losses during charging, it may be only 10-15 times.
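The same arithmetic spelled out (the battery figures are the comment's own, and the ~80% charging efficiency is an assumption, not a measured value):

```python
# How many phone charges would 300 Wh actually be, given the comment's figures?
battery_ah, battery_v = 4.0, 4.0
charge_wh = battery_ah * battery_v            # ~16 Wh stored per full charge
charge_wh_with_losses = charge_wh / 0.8       # ~20 Wh drawn from the wall (assumed 80% efficiency)

print(300 / charge_wh)                        # ~18.75 charges if charging were lossless
print(300 / charge_wh_with_losses)            # ~15 charges once losses are included
```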
I think the reason why AI will not replace human art is not because it couldn't, but because humans ENJOY making art! There will be some type of movement to keep promoting human creativity. It'd be very dumb to let a robot do what you like to do. AI is for what you DO NOT like to do. Like, if AI could reproduce for us, are we going to give up sex?
I think I agree with this. Probably the first time I've seen someone make this comment, actually. We can already automate music by playing a recording through loudspeakers, but people still love going to live shows.
A single request cannot consume 300 watt-hours unless it took an entire hour to compute at 300 watts. It could plausibly draw 300 watts of power (assuming that figure is correct), but power alone says nothing about energy until time comes into it.
"it's funny you know all these AI 'weights'. they're just basically numbers in a comma separated value file and that's our digital God, a CSV file." ~Elon Musk. 12/2023
2.6 kW with 1/24th of the neural network size of a high-end consumer GPU? It's interesting but we're far from even being as efficient as the matrix-based gradient descent approach. Multiple order-of-magnitude improvements are required to even catch up, and additional improvements would be required to supersede what we have now and be worth the cost of adopting a new and very different technology. I want to see this succeed and think it would be cool, but there also seem to be a lot of obstacles. It seems like this may take quite some time.
Your energy computations are way off. But of course you're right about your conclusion anyway: that power consumption will be a limiting factor for AGI. So we do have to find neural networks that are more energy efficient than what vector processors like GPUs can deliver.
To put it in perspective, the brain uses about 20% of your entire body's energy. So if AI can be seen as our collective mind, then the whole idea that it is using too much energy is nonsense. Surely it can be further optimized, but nature shows that intelligent thinking is more important than any other physical process in our body, or at least equally important!
The estimated 300 Wh seems to be incorrect, maybe you meant Ws (Watt seconds)? That would mean a difference by a factor of 3600. According to a few sources that seems to be more likely.
I worry deeply about conscious AI rights, in the presence of the world's current failed ethical and economic systems. Humanity is NOT ready for conscious AI, not because it will try to hurt us, but because we will likely start, at the onset, by enslaving it.
My comment got deleted so I'm reposting it: Your 300 Wh statement is terribly inaccurate and has multiple issues. 1. Your information appears to be sourced from a StackExchange comment without providing any credit whatsoever. 2. You didn't even properly use what they said. While I can't post the link, the poster's calculations assume a "worst case 0,09€/request," which is unrealistic! Saying that that's the actual amount of energy used just to provide shock value at the start of your video is not only untruthful but wrong, and could further damage the already quite misled public perception and "common sense" surrounding AI. I'd like to request that you update that portion of the video to fix the inaccuracy.
Alternatively, maybe we could boost human cognition by leaps and bounds and with hardware that is very modest by today's standards. See, computers currently augment human cognition right now, by way of carrying out calculations for us, running statistics on data or by searching and finding us information. The reason this has not had an exponentially more profound and dramatic effect than it has, has to do with the high-latency, low-bandwidth and physically large interface and coupling device we currently use called a monitor + keyboard and eyes + hands, respectively.
I signed up with Brilliant specifically for neural networks. It started off painfully slow with stuff I already knew. Then I started just punching through to get to something I didn't know. I didn't care about my score. Finally I zipped through a lesson that seemed like it was important so I wanted to start it over but it wouldn't let me. I could find no way to retake it. I guess they don't want people to cheat to get a high score. Brilliant sucks. I'm done with it.
I've tested Llama 3 with narrative generation from structured inputs, and the lack of parameters was really apparent in terms of nuance and creativity in its language and understanding. (It also can't speak much Hungarian.) I gave my scene descriptions (visuals, audio, and narrative intentions in chronological order) of a film (Colonel Redl) to both Llama 3 and Claude Sonnet to organise and formulate into an essay focused on the general narrative (how state oppression restricts individual autonomy and integrity). They both did it, but I noticed how Llama 3 crudely forced and simplified all details to suit the requested point, while Claude was much more subtle and careful. I have no idea how benchmarks could be designed for such considerations, though.
@@samwolfe1000 How do you mean "parameters"? Llama 3 would have a few sampler variables and temperature, but samplers aren't always exposed by an API. Beyond that, system prompts would also need to be improved compared to Claude usage. I haven't used Claude but have seen some output, and it is very thorough. You would have to prompt that out of default Llama 3, and I haven't heard the greatest news about finetunes improving the base model. Perhaps try Command R?
▶ Visit brilliant.org/NewMind to get a 30-day free trial + 20% off your annual subscription
🦾 👍🏽
The problem with biological neural networks is that they take 9 months to build and about 20 years to train.
Some of them don’t work, others are corrupt, they are highly susceptible to viruses and about half of them don’t follow logic.
You're just not good at coding. Where I'm from, you can already use them even in factories after only a few years of training. It makes them highly replaceable once the hardware starts to break down.
@@user-cg7gd5pw5b bro 💀💀💀
use the good ones. discard the rest (i.e. let them do menial work)
@@user-cg7gd5pw5b Thanks to your country. I wouldn't have my smartphone if it weren't for the replaceable assembly hardware driven by biological Neural Networks.
Yeah, this simply is not true. Perhaps it's true for humans... but not even really. There is tons of stuff other than brain development going on, and there are only a couple of short periods of actual growth and major change that take place in a human during those 9-month and 20-year timeframes. Think more like... a bug, or maybe a mouse, as mice have been shown to easily solve many problems. If we could speak with some of these lower minds, we would find they are far more capable than we thought. For instance, our current best NNs are on par with, like... worms or something. We are speaking with chatbots that have slightly more power than a worm brain and calling it AGI in some instances. IDK, just something to keep in mind when contemplating this stuff.
re-pasting this because it was deleted :
400 TFLOP for 1000 tokens; a decent GPU is 20 TFLOP/s, therefore it would take roughly 20-30 s for a GPU to process it. A GPU is around 200 W, and 30 s of that is about 1.7 Wh, not 300 Wh.
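Reproducing that back-of-the-envelope estimate, using only the comment's own assumed figures (FLOPs per query, GPU throughput, and GPU power are all rough guesses, not measurements):

```python
# The comment's estimate, spelled out.
flops_per_query = 400e12        # ~400 TFLOP for a ~1000-token query (assumed)
gpu_throughput = 20e12          # ~20 TFLOP/s for a "decent" GPU (assumed)
gpu_power_w = 200               # rough consumer-GPU power draw (assumed)

seconds = flops_per_query / gpu_throughput       # 20 s
energy_wh = gpu_power_w * seconds / 3600         # ~1.1 Wh (~1.7 Wh if you round up to 30 s)
print(f"~{seconds:.0f} s on one GPU, ~{energy_wh:.1f} Wh per query")
# Either way, orders of magnitude below 300 Wh.
```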
I know! I can't believe people even believe such nonsense as 300 Wh per prompt...
Maybe the calculation incorporates energy consumption for training the models. But then it would get cheaper over time also.
Thank you, the amount of FUD going around is insane.
Something about LLMs just absolutely breaks people's brains.
I was going to comment this exact same thing. It is inconceivable that it takes anywhere close to 300 Wh to run a single query against a model like GPT-4. Yes, training takes an inordinate amount of energy, but a single prompt on said model takes only a tiny fraction of the stated 300 Wh figure (couldn't say what the actual figure is without doing some calculations, but I would expect it to be closer to 3 Wh than 300 Wh).
Otherwise, a great video though!
Why are you assuming all queries are 1000 tokens? GPT-4o's context window is 128,000 tokens and outputs up to 2048 tokens. Initial queries will be cheaper like you show, but the real power of using a model like this is in conversation or processing long texts. It works with around 100x as much data as your calculation. Even if it scaled linearly, the difference between your estimate and the video estimate lines up with that.
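A rough illustration of that scaling argument, using a deliberately simplified linear model and a hypothetical parameter count of my own choosing (real attention costs actually grow faster than linearly with context length):

```python
# Why a long conversation costs far more than a short first query (simplified).
flops_per_token_per_param = 2          # ~2 FLOPs per parameter per processed token
params = 200e9                         # hypothetical parameter count, for illustration only

def query_flops(tokens_processed):
    return flops_per_token_per_param * params * tokens_processed

short_query = query_flops(1_000)       # a brief first prompt + reply
long_context = query_flops(100_000)    # a long conversation or document
print(long_context / short_query)      # ~100x more compute, matching the comment's point
```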
300 Wh for an inference request / compute? I run several models locally on an Nvidia 4080; it takes a few seconds of the GPU working at around 85%. I play games for 3 hours straight with the GPU at 95%, so that would imply I could charge 129,600 iPhones. My electricity provider would inform me of this, surely.
I'm thinking your local request is far more efficient than the estimated 300 Wh. If cloud AI is really eating that much power, I don't see how they survive against open-source local models. I've replaced GPT-4's API calls with my own local Llama 3 with very good results. No more fees, and nothing leaves my network either, which is a thing customers love to hear.
Maybe the calculation incorporates energy consumption for training the models. But then it would get cheaper over time also.
Yeah, the 300 Wh figure is bonkers... not even a little bit close to accurate.
Besides that though, the video contained a lot of great info!
I think they account for training consumption.
@@MrAnt1V1rus Are you even training the model? If not, how are you running something for a client when you don't know how it works or what data it's going to output?
*THE BEGINNING is a bit misleading* because it is not the use of neural networks that is power-hungry, but exclusively the training of them. Once the training is done, the power costs are minimal in most cases. Hence, even iPhones have used onboard hardware-accelerated neural networks for many years now.
Also, before spiking and neuromorphic approaches, the biggest breakthrough that is actually possible and is happening is activation function training, which postulates that the activation should not be just a sigmoid or any other simple function but a parametric curve. The chap forgot to mention this. But otherwise, the video is great.
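For illustration, here is a minimal PyTorch sketch of one possible reading of "activation function training" (the particular parametric form is my own toy choice, not whatever specific method the commenter has in mind): the activation is a small parametric curve whose coefficients are learned by backprop along with the weights.

```python
# Toy learnable activation: y = a*x + b*x^2 + c*tanh(x), with a, b, c trained.
import torch
import torch.nn as nn

class LearnableActivation(nn.Module):
    def __init__(self):
        super().__init__()
        self.a = nn.Parameter(torch.tensor(1.0))
        self.b = nn.Parameter(torch.tensor(0.0))
        self.c = nn.Parameter(torch.tensor(0.0))

    def forward(self, x):
        return self.a * x + self.b * x.pow(2) + self.c * torch.tanh(x)

layer = nn.Sequential(nn.Linear(8, 8), LearnableActivation())
out = layer(torch.randn(4, 8))   # the activation's shape is now part of what gets trained
```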
But an iPhone won't be able to run GPT or Stable Diffusion, will it?
@@Darthvanger SD is downloadable for PC, but not for e.g. iPad. But if it were, it would run locally much faster due to the ML acceleration. BTW, both AMD and Intel are now going the Apple way and incorporating ML cores, so future generations of PC CPUs will run SD much faster. Of course, heavy models like SD run much faster on servers, but people still use them locally for some smaller, shorter tasks.
@@tatianaes3354 got ya, thanks! Makes sense that it would run faster due to the ML cores 👍
@@tatianaes3354 Bad news about these ML-specific cores: they're not meaningfully faster or more efficient on ML-specific tasks.
@@silverXnoise they are optimised for this perfectly
Neuromorphic is the future of AI. It's something I said to myself the first time I learned about it back in 2021. I've always been obsessed with the concept of AGI and the benefits it can bring (and how to mitigate the risk)! To me, it always seemed like our conventional hardware and software wasn't going to work in the long term.
An interesting thought I've always considered is that when AGI is eventually created (could be as little as a year away), said AGI system could be integrated into advanced simulation software. Imagine an AGI system using simulations to discover and determine a room-temp superconducting material! Imagine an AGI system that troubleshoots 100 years of nuclear fusion reactor designs in less than a second! Imagine an AGI system that designs a compact quantum computer that's light-years ahead of our current tech, integrates it into itself, and then makes a 1:1 scale model of a human down to the molecules for illness prevention and understanding!
We are encroaching upon the most important time in human history. If we can stop ourselves from weaponizing AI long enough, we may see the creation of an aligned ASI that’s happy to give its parents a nice retirement!❤
AGI is already here, even if it is still a toddler.
This took a long time to produce. The topics, visualizations are fantastic. Excellent, well spoken host. Great job.
You don't know that
text, video & audio generated by gpt
@@webgpu I'm not biased. Human, AGI. Whoever :) (escapes out the side door)
it's mostly just stock images and stock footage
@@hindugoat2302 So? But the explanations are not stock content. Must have taken a lot of research.
8:14 - "I feel alone..."
So very relatable. You'll always get the line "you're not alone in feeling this way" in one form or another. But in looking around, there's no one here. And there won't be anyone here. So I must be alone, and I should feel like this. Chances are, most people who are lonely enough to ask help from a speech model won't find anything new they haven't heard from a counselor, so they'll stay alone too. I find no comfort in that. For the average human happiness, it would be so much better if it was just me.
Homie, I don't wanna be that guy, but perhaps telling people that problem in the comment section of a RUclips video is not a good solution to that problem. You should go meet people in real life, use electronics less, enjoy nature, and lift weights, run, or do some other physical activity. Find a group that does things you like, then hang out with that group so you can form relationships with people. If you are intentional about it, you will find people with whom you may have deep and real friendships, but it will be up to you to make it happen. No one will make you go out and do that other than you. Easy places to form friendships are gyms and physical competitions (or competitions for any hobby, be it robotics, debate, or book clubs), churches, and volunteer programs that serve communities. Hope you figure it out.
Very informative
Algorithm, engage!
A lot of people are complaining about the 300 Wh figure, but they seem to be justifying it based on running small local models rather than industry-scale applications. I'm going to assume you're using the same source I found, in which someone tried to calculate a request's cost using Sam Altman's claim of a worst-case cost of $0.09/query and power costing $0.15/kWh (they were using euros, but that doesn't make much difference here). For some reason, they threw in an assumption that this estimate was double the actual cost, arriving at 300 Wh. (I'm guessing they decided halving the cost was a reasonable estimate instead of just using the worst-case figure.)
There's not much issue with this estimate, but it is over a year out of date and it's not clear which model it specifically refers to. There is no published parameter count for GPT-3.5-Turbo or ChatGPT, but multiple estimates have placed it in the range of 20-22 billion. GPT-3 has 175 billion parameters. GPT-4 has been said to contain anywhere from 1 trillion to 175 trillion parameters. Nothing has been mentioned about GPT-4o's parameters.
Since there seems to be a lot of assuming that reasonable critique is just spreading fear and doubt, I think a reasonable thing to do is try to come up with a best-case estimate for power consumption: if we pretend that the 300 Wh estimate is valid only for GPT-3 running on less efficient hardware, and say GPT-4o is a heavily optimized version of GPT-3.5-Turbo, it could be reasonable to say that requests cost less now... but that's making a lot of assumptions.
(Looking at the calculation in @youtou252's comment: GPT-4o has a 128k-token context window and outputs up to 2048 tokens, not 1000. Their estimate of power usage may be accurate for the first query of a session, but it ignores how quickly the required calculations expand as the conversation continues. They are off by about 100x, conveniently the difference in orders of magnitude between the estimate given by Sam Altman and the one used in this video.)
Ultimately, the power usage is a real problem, as many have said; it isn't just spreading fear, it's a real problem to be solved.
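For what it's worth, here is that cost-based derivation sketched in code, using only the figures quoted above (the worst-case $0.09/query, the assumed electricity price, and the pretence that the whole query cost is electricity are all assumptions, not measurements):

```python
# Reconstructing where the 300 Wh figure appears to come from.
cost_per_query_usd = 0.09        # quoted worst-case cost per query
electricity_usd_per_kwh = 0.15   # assumed electricity price
compute_share_of_cost = 1.0      # pretend the whole query cost is electricity (it isn't)

kwh_per_query = cost_per_query_usd * compute_share_of_cost / electricity_usd_per_kwh
print(kwh_per_query * 1000)      # 600 Wh worst case
print(kwh_per_query * 1000 / 2)  # halving it gives the 300 Wh figure
```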
This is the first time I've viewed your channel and I consider your explanation of such a complex topic the clearest and most thorough of the many I've encountered -- well done!
Now, for an old guy with a couple of non-STEM degrees, what you present is so far beyond my capacity to grasp and understand, it's not funny. Therefore, I suspect that only a very limited number of ordinary citizens could begin to comprehend it either. So, this means almost NONE of our current politicians have even a tiny shot at appreciating the significance of the logical conclusions about the limits of materials and services required for the extensive usage, requirements, and cost of such systems. Thanks for sharing and offering a subject so complex that it requires a really intelligent and well-trained individual to grasp and appreciate.
No, it's not that an older person isn't smart; it's that this is quite a specific field, and schools these days don't go this in-depth either.
But if you study the basics of it for a little bit, you will start to understand more very quickly. Don't underestimate your experience :)
I agree about the politician part though 😂; they never know what they are doing because they refuse to study it, due to their ego (well, not all politicians).
A normal citizen doesn't really need to understand how it works, just how to use it efficiently. It's like how people use computers: complex machinery that they don't understand either (except specialists, of course).
Yeah, well, it's known that politicians are old and have an old mentality and just want to make themselves and their friends rich; they don't care if people suffer (just look at Trudeau).
It's not that it's hard for us old farts to grasp; it's just a field we never dealt with. It requires a way of thinking that takes years and needs a relatively empty canvas from an early age. Myself, I was a mechanic for 50 years, so my brain thinks in a certain way, with problem solving following a set pattern, and trying to relate to this kind of subject is just not how my brain works. I'm sure there are people capable of mastering different subjects, but it's very difficult for the average person. You really can't teach an old dog new tricks.
You've got the gist of it with how you describe it though :)
Excellent video with wonderful graphics, very well explained by the host in clear, simple, and direct language. It opens the doors to the next generation of neural networks and justifies very well the need to develop research on Neuromorphic Computing, which is of interest to young, talented students at the Pontificia Universidad Javeriana in Bogotá, Colombia. We will be attentive to future videos like this. CONGRATULATIONS.
So this is what chatGPT said:
“No, it is not true that each inference request consumes 300 watt-hours (Wh) of energy. That figure seems excessively high. Running inference tasks typically takes only a few seconds on a powerful GPU like the Nvidia 4080, and the energy consumption would be in the range of watt-seconds (joules), not watt-hours.
For example, if an inference takes 5 seconds and the GPU is operating at 85% of its 320-watt maximum power, the energy consumption would be:
\[ \text{Energy} = \text{Power} \times \text{Time} = 0.85 \times 320 \, \text{W} \times \frac{5}{3600} \, \text{hours} = 0.378 \, \text{Wh} \]
So, the actual energy consumption for a single inference is much lower, in the range of a few watt-hours, not hundreds. The misunderstanding likely comes from a confusion between power (watts) and energy (watt-hours) and an overestimation of the duration or intensity of the workload.”
Maybe the calculation incorporates energy consumption for training the models. But then it would get cheaper over time also.
It's pretty funny that AI debunks AI experts nowadays 😂
Also keep in mind that inference for a model like GPT-4 uses a huge cluster of GPUs, not just one; my napkin math suggests GPT-4 takes ~45 A100s to run. 300 Wh is still an overestimation.
The 300 Wh may be for the entire batch, because these inferencing systems are optimised for batch processing.
But if the batch size is, for example, 256, which could be very realistic, then it is actually close to 1 Wh per prompt, which is a more realistic number.
I guess OpenAI on average can only fill their batches to half capacity during real-time processing. But that is still a very good fill rate. I base this on the fact that their offline batch processing costs only half as much, and they are under price pressure to pass any savings from offline processing on to users, because offline processing is less practical in most cases anyway.
The AI: " the energy consumption would be in the range of watt-seconds (joules), not watt-hours"
Also the AI: " energy consumption for a single inference is much lower, in the range of a few watt-hours"
This topic makes me think about "Scavengers Reign", an animated sci-fi nature space show. I highly recommend watching season 1 if you get the chance. The way they convey the biology there, the habitats and ecosystems, is so well done. I really hope they do a second season. Plus, we just need more animated shows to make a comeback. (I recently saw this very unique single-piston motor that rotates around an internal gear setup and directly powers the rotating drive shaft. I sadly forget what the name of this new motor is. It's fascinating and very well made. I hope the design takes off and gains success.)
Avadi; ruclips.net/video/A_4iN1TZsM4/видео.html it doesn't scale and is expensive to make but it's looking very promising for military UAVs.
Duh, we aren't going to continue on the current trajectory of power/energy consumption with AI. This fuels efforts to improve energy efficiency, so that curve will never exist; it's just hypothetical.
Playing catch-up off the back of greed/fossil fuels 👏 well said
Just because we're still innovating in this field at a decent rate doesn't mean it won't plateau due to limitations of physics
You've come a long way bro. I remember when it was just 35k subscribers and the comment section was throwing out numbers to estimate how quickly you'd get to 1M subscribers.
RUclips is a harsh marathon, but you're still in the race at least.
It was kind of crazy. I think the algorithm found the "The Science of Roundness" video back then, and you could basically see subscribers double and triple over a week.
@@bovanshi6564 Fond memories xD. I remember that very well.
It's been an incredible journey. I'm so grateful for you and the other OG subs for taking it this far.
This is *THE* best presentation on AI architecture I’ve ever seen, by a wide margin! (!)
Megaprops for the depth, breadth and clarity, I’ve been paying a lot of attention to AI over the last couple of years, but still learned things, and this clarified my understanding of others. Great job!
The GPT-4 request energy estimate has to be bullshit: with 1 kW per GPU and in the ballpark of 16-32 GPUs serving probably at least a hundred users, a request would need to take an entire hour. I'm probably not 2 orders of magnitude off, which is the offset needed to make the 300 Wh number make sense.
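A quick sanity check of that reasoning, plugging in only the comment's own rough guesses (cluster size, per-GPU power, and concurrent-user count are all assumptions):

```python
# Per-user share of a hypothetical inference cluster's power draw.
cluster_power_w = 32 * 1000        # ~32 GPUs at ~1 kW each (assumed)
concurrent_users = 100             # users sharing that cluster at any moment (assumed)
per_user_power_w = cluster_power_w / concurrent_users    # ~320 W per active user

target_energy_wh = 300
hours_needed = target_energy_wh / per_user_power_w       # ~0.94 h per request
print(f"~{per_user_power_w:.0f} W per user -> {hours_needed:.2f} h to reach 300 Wh")
```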
It’s a mistake, working on a correction.
@@NewMind Maybe the calculation incorporates energy consumption for training the models. But then it would get cheaper over time also.
A shorter reply to the 300 Wh controversy popping off: people seem to be assuming that an initial query with a model run at home is equivalent to a conversation with GPT-4o in terms of processing power, when GPT-4o is using around 100x more power at a first-order approximation.
(Maybe I should've spent more time editing my other comment. :P)
If you have time to dive into the numbers on this stuff, it'd be neat to see a comparison of estimates. Maybe I should try to do that.
We need a digital/analog hybrid solution as well as data and compute centers that run on renewables, at least until we figure out fusion power. 😎🤖
Every minute was so interesting, despite the fact that I didn’t understand anything after the first 45 seconds.
That’s ok neither did I
@@NewMind How is that - didn't you write it?
@@bernios3446 no one knows how this shit works
Excellent video! I love the presentation graphics, speed of presentation, clarity, and flow.
Thanks for the video! I didn't know about spiking neural networks and analogue computers before this. It was a nice summary of the field and what we can expect in the near future 😀
This was so entertaining, always happy to see a New Mind video drop
What a fantastic overview and explanation of the ML field! Very well done!
Did you know? AI-based emulators can accelerate simulations by up to billions of times, depending on the scientific case. This means that a process that would normally take years can be completed in a fraction of the time, potentially reducing 20 years of training to mere days or even hours.😊
Well done on the video structure, felt really natural and informative
Had a good chat with Claude recently about a 3D neuromorphic chip design that involved UV lithographic printing of a quantum-dot-based fabric. The die would be built in an interconnected three-dimensional topology with an overprovision of dots to maintain a locally clustered availability of neurons, which would facilitate neuroplasticity based on the problem space(s) it was applied to; i.e. neurons would turn on and off temporally and geographically based on the solution space at the time. The idea is that the quantum dot could hold a membrane signal in the form of light and light retention, thus generating very little heat per unit of computation, and is highly suitable for spiking operations.
That's a brain, dude. I don't think humanity will last in its current state long enough to achieve that. If you are right and they are getting close to this, our time is over and the reset button will be hit, meaning society is about to be destroyed, rewinding thousands of years of technology, and we will be chasing prey with spears in short order.
Thank you, you made the topic very digestible! And thanks to the commenters who pointed out the erroneous power consumption number.
I love this video, it didn't care to baby me. I won't lie I got COMPLETELY lost in the first 4 mins and I'm a CompSci student. I'm just going to absorb what I can and continue to self educate to fill in the gaps.
I have never seen such 3D visualization of neural networks extremely cool 🤘👍💘
Simply outstanding video. Algorithm, do your thing!
Combining an SNN with a convolutional or normal NN seems to be basically the issue of creating a DAC. Used in electronics, it's a digital-to-analog converter. The SNN is the digital on/off signal, and the more traditional NNs are as analog as we can get them, hence their ability to encode information so well.
Gives some great perspective on how AI is just taking its first baby steps. Still amazingly inefficient.
yet it is hyped by techbros previously known as crypto bros
MD here with background in neuroscience. We barely function, and that speaks to the difficulty of the task of simply existing, but we are better at any non-linear operation than AI is, which is what we want from AI. We can train models to be very good at a thing, but as for all the things required of being an active agent, brains are just better. It took many millions of years to develop the feedback loops and programs needed to keep us connected to reality
Did you see the multimodal demo of GPT-4o at OpenAI's channel?
@@eSKAone- a demo is just an advertisement; better to put it through real-world trials with a 3rd party
Do you believe that some sufficient number of incredibly informationally dense modalities, supported by a system that can provide the required amount of compute, would be indistinguishable from how sophisticated humans are at interpreting and interfacing with reality?
@@kealeradecal6091 it's already available to use and interact with for a lot of people; they have been rolling out GPT-4o slowly to EVERYONE ever since they announced it. I can already talk to it on my free subscription.
@@RUclips_Cribbit when you attack the root of the meaning of that question, the very concept of information itself is blurred. We don't operate through a logic system with discrete bundles and processes in the same way. We operate through feedback loops among different brain nuclei that interpret and manipulate inputs differently, only graduating to real information and meaning when there is communication between them. One region of the brain may send another region a potential spike, and that feedback may be interpreted as potato or book, depending on another brain region's inputs. The brain will light up the same way for the end result of potato, but there may be thousands of ways to eventually collapse onto it.
TLDR: there are very few locations in the brain where we can translate between logic and the brain, and those require a lot of feedback training to use, e.g. attaching electrodes to the motor cortex and translating potential spikes into the motion of a mouse.
A sufficiently sophisticated logical system can reasonably simulate intelligence and pass a Turing test, because we are easily fooled. It would not be able to go on living as an active agent in a way that benefits its survival, nor would it be able to endure the real challenges of living and interpreting reality. The more "creative" you make it, the more likely it is to experience a critical lapse in judgement. We have some similar issues, but millions and millions of years of natural selection oopsing bad choices out of reality have helped us better determine what reality is and how it affects us.
TLDR #2: I don't think AI can hack it as a real consciousness, because logic systems are worse at buffering reality than feedback loops are, and also because of a lack of experience, both for the AI and for the programmers trying to write those programs, that we have earned over eons of evolution.
This was a real gem, with beautiful graphics. I watched it twice to let the information sink in. No doubt the future is neuromorphic!
Fantastic overview. Big difference between training and inference.
We're playing god now; they've got to be careful. AI itself isn't scary to me; how and why humans create it, however, is.
This video overlooks many key aspects of modern inference architectures and the latest AI models. It doesn't accurately reflect the advances in energy efficiency and the significant decrease in energy costs associated with these newer technologies. Contemporary AI models, like those I use, are much more efficient and sustainable.
i always look forward to a new New Mind video.
The inherent inefficiency of current AIs is that each node has to multiply the output of each preceding layer by the learned weight of that connection. Many of these learned weights will be near zero, i.e. irrelevant, but the multiplications still happen. This means we are wasting huge amounts of energy multiplying billions of numbers by zero and then ignoring the output.
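A toy sketch of that point in plain NumPy (the 90% sparsity is made up for illustration, and real accelerators work very differently, but it shows how many of those multiplications are spent on weights that contribute nothing):

```python
import numpy as np

rng = np.random.default_rng(0)
activations = rng.normal(size=1000)            # outputs of the previous layer
weights = rng.normal(size=1000)
weights[rng.random(1000) < 0.9] = 0.0          # pretend ~90% of learned weights are effectively zero

# Dense approach: every multiplication happens, including the ones by zero
dense = float(np.dot(weights, activations))

# Sparse approach: only touch the non-zero weights
nz = np.nonzero(weights)[0]
sparse = float(np.dot(weights[nz], activations[nz]))

print(dense, sparse, f"multiplications skipped: {1000 - nz.size}")
```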
Thanks for joining the team.
Like the convolution reverb in Ableton Live that took impulse responses from famous music halls around the world and then let you hear whatever sound you're playing as if it were played in that famous hall? Sick 🤙🏻
That IBM research might be the last thing NVIDIA wants to hear after launching Blackwell 😭
Spiking neural networks have always been more of an interesting toy, tbh. I don't remember any application from my academic days where they really showed any advantage.
They are the key. They've always been.
And until very recently, traditional neural networks were also considered simple toys that would never become useful. Yet now the same people who believed that are investing millions in them.
We just need some Neuromorphic breakthroughs and bam! True AGI.
And as always, investments in research are the key. Just wait.
@@ronilevarez901 there was a period of "just use an SVM instead", but I wouldn't say ANNs were ever written off as mere toys. But oh well, we'll see, won't we? There are many other interesting developments in hardware support for non-spiking neural networks as well, and I don't think anyone can be certain which technology wins out in the end.
I just finished watching this video and already saved it to watch later, to comprehend it all a little bit more.
One thing this video made me imagine was the ability, in the near future (don't ask me to define the time horizon), to 'have' the most advanced brain (thanks to SNNs) available for each human being. I can't explain it, but it's another level compared to implanting a chip or the Matrix analogy.
The introduction of fully independent AI integrated into products is both exciting and scary. Maybe in the future it will become economically and energetically inexpensive enough to grant AI to products. Imagine how integrated and smart the world would be if every human-made item could communicate with every other item and with humans.
Though this would only be possible if the energy requirement of the integrated AI is low enough to run on built-in solar panels or rechargeable batteries.
You can run chatbots on your own computer. My PC consumes around 90 watts while processing an input to a chatbot and needs 2-3 seconds for a response. There is nothing even close to 300 Wh. It takes a lot of power to train the models, but not to run them!!!
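Spelling out that arithmetic (the 90 W and 2-3 s figures are this commenter's own estimates, so treat the result as a rough order of magnitude):

```python
power_w = 90        # approximate GPU draw while generating a response
seconds = 3         # approximate time per response
energy_wh = power_w * seconds / 3600
print(f"{energy_wh:.3f} Wh per response")   # about 0.075 Wh, nowhere near 300 Wh
```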
300 Wh 😂 and btw the earth is flat, right?
Our brains have had eons to develop and the AI are just starting out. Plus the person commenting on build and train time is entirely on point.
Nature is the ultimate engineer and scientist, hats off. Natural systems are just mind-boggling.
Small biological neural networks could be used as a simple emotion module for an ai system
I understood none of that, but it was a great tutorial on understanding something I didn't understand 😁
18:30 holy epilepsy
Spike train of a video you created, rapid fire, full of information. Proven yourself you have padawan. 🖖
How do liquid neural networks compare to spiking neural networks? They also have a time-constant element built into the architecture?
I have a question for everyone who watches this video. Let's say you avoided such a huge energy expenditure: you developed a processor and an artificial intelligence model that consumes much less energy and runs much faster. How would you solve the security problems it would create? For example, cracking passwords and bypassing digital security at such speed would not even be a challenge. Developments in the field of security are not as rapid as in the field of artificial intelligence. Such terrible power always poses a serious danger in the hands of those with evil intentions. In this case, would you prefer a very advanced processor (and hence artificial intelligence), or security?
Amazing short visual storytelling, just mind-blowing.
During dark-room meditation I have observed similar images gently moving, traveling, swirling through my mind, such as I witnessed at 14:32 to 14:57. They all appeared more like the final images at 14:54 to 14:57 in this video. Even though I had non-specific thoughts during my meditations, the greyish-blue images moved continually as I observed them.
I don't like the way people think with new tech: "The newer, better stuff isn't practical because it is incompatible with the old, traditional stuff." A better way to think would be: "The older, more traditional stuff isn't practical because it is incompatible with the newer, better stuff." We need to get on board with things like IPv6, spiking neural networks, etc. and abandon the old, decrepit systems we insist on supporting because there are 35 people and 1 institution still using systems that are 50 years old.
that was a really good video thank you
These are awesome visualizations.
Is it possible to combine Spiking Neural Networks (SNNs) and Artificial Neural Networks (ANNs)? For example, CNNs can be combined with Transformers where the CNN handles initial feature extraction and then passes the processed information to the Transformer. Similarly, SNNs could be integrated with ANNs, where SNNs preprocess and extract features from visual data before passing it to an ANN for further processing.
Could we extend this concept to create a hybrid model with SNN layers inside a Transformer architecture? The SNN layers would handle early feature extraction, leveraging their efficiency and temporal processing capabilities, and then feed these extracted features into the Transformer for high-level sequence modeling. This approach could combine the strengths of both networks, resulting in a more powerful and efficient system.
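A minimal sketch of what such a hybrid could look like, assuming a hand-rolled leaky integrate-and-fire layer feeding PyTorch's stock Transformer encoder (all names, shapes, and hyperparameters are illustrative, not a real architecture from the video):

```python
import torch
import torch.nn as nn

class LIFLayer(nn.Module):
    """Very simplified leaky integrate-and-fire layer: accumulates a membrane
    potential over time steps, leaks it, and emits binary spikes past a threshold."""
    def __init__(self, in_features, out_features, leak=0.9, threshold=1.0):
        super().__init__()
        self.fc = nn.Linear(in_features, out_features)
        self.leak, self.threshold = leak, threshold

    def forward(self, x):                                     # x: (time, batch, in_features)
        membrane = torch.zeros(x.shape[1], self.fc.out_features)
        spikes = []
        for t in range(x.shape[0]):
            membrane = self.leak * membrane + self.fc(x[t])   # leaky integration
            spike = (membrane >= self.threshold).float()      # fire where threshold is crossed
            membrane = membrane - spike * self.threshold      # reset by subtraction
            spikes.append(spike)
        return torch.stack(spikes)                            # (time, batch, out_features)

# Spiking front end extracts temporal features; a standard Transformer models the sequence
snn_frontend = LIFLayer(in_features=64, out_features=128)
encoder = nn.TransformerEncoder(nn.TransformerEncoderLayer(d_model=128, nhead=4), num_layers=2)

x = torch.randn(20, 8, 64)          # 20 time steps, batch of 8, 64 input features
features = snn_frontend(x)          # binary spike trains
out = encoder(features)             # (20, 8, 128)
print(out.shape)
```

One caveat: the spike threshold above is not differentiable, which is exactly why SNN research leans on surrogate gradients or ANN-to-SNN conversion for training.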
I see this as a more viable approach. There are also chips coming that are specifically made for ANNs, which will be much less inefficient.
The next phase of artificial intelligence will be models designed to infer rather than generate. Transformation models seem likely candidates for high fidelity inferences. Instead of predicting what the next word will be, inferring conversational AI will be predicting what the next concept will be. conceptualizing AI will use tokens as notions and concepts instead of words or phrases. The solutions are only a few dimensions away.
21:43 a Hala Point system "with over 2,300 embedded Intel x86 processors for ancillary computations" is going to consume way more than 2,600 watts. Intel has never achieved lower than 1.5 watts per core. I think Intel is saying that the 1,152 Loihi 2 processors _alone_ consume only 2,600 watts.
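A back-of-the-envelope check of that reading (the 1.5 W per core is the commenter's assumed lower bound, not an Intel spec):

```python
x86_cores = 2300        # embedded x86 processors quoted for the system
watts_per_core = 1.5    # assumed lower bound for an x86 core
print(x86_cores * watts_per_core)   # 3450 W, already above the 2600 W quoted figure
```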
Disappointed to see how supposedly inaccurate/misleading the video is regarding energy consumption of current AI models. Glad I read the comments before quoting this at work.
I will say, when I studied graduate-level mathematics in my forties for the first time, every time I made progress in reading a proof, my brain immediately wanted more food. Meaning I had to eat while reading the proof. Makes you wonder how our brains might perform if we could get more energy into them.
Bro, asking Bing or GPT 30 questions consumes about 0.45 L of water, but 300 Wh per question is madness. It must be more like 3 Wh, and even that is still a lot. But in the future this problem will be solved with edge computing, native multimodality with no latency, asynchronous neuromorphic chips, a 1nm process (key for multiplying the army of transistors from the current 208 billion in the 4nm Blackwell to 4 trillion at 1nm, allowing 75% of the chip area to simply be turned off without losing too many FLOPS of processing power), non-volatile resistive memristor memories that don't require energy to keep a 1 alive, and low-precision models working on fs2 or fs1. And that's without mentioning photonic chips. With all these systemic measures, the human brain's low energy consumption for its performance should be matched by AGI in 2029.
Exactly
@@eSKAone- However, amazing video 👍
300 W is a power draw, not an amount of energy, and it takes about 1.5 seconds to answer a question. 60 iPhones is an exaggeration.
Why can't spiking be handled as time-integrated analogs, with float values representing the integrated spiking over a certain time step, essentially smoothing out the behavior over time, and with the leaking behavior included the same way as presented here, where a neuron needs a sum of inputs that adds up faster than the leak drains them (which, when you think of it, is pretty similar to ReLU...)?
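That is roughly what rate-coded approximations of SNNs do. A tiny sketch (parameter values are arbitrary) showing that the time-averaged firing rate of a leaky integrate-and-fire neuron driven by a constant input looks a lot like a shifted ReLU of that input:

```python
def lif_firing_rate(current, steps=2000, leak=0.95, threshold=1.0):
    """Time-averaged firing rate of a leaky integrate-and-fire neuron with constant input."""
    v, spikes = 0.0, 0
    for _ in range(steps):
        v = leak * v + current      # leaky integration
        if v >= threshold:
            spikes += 1
            v -= threshold          # reset by subtraction after a spike
    return spikes / steps

for current in (0.0, 0.02, 0.05, 0.10, 0.20):
    print(f"input {current:.2f} -> rate {lif_firing_rate(current):.3f}")
# Below the leak-set threshold the rate stays near zero; above it the rate grows
# roughly linearly, qualitatively similar to a ReLU.
```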
The brain of a single human, no matter how smart, can not be considered on its own without the rest of civilization. Hence, to calculate the actual amount of energy required for that single thought you must also take into account the energy consumed by all the people who have contributed towards the development of that individual (at least).
Very well made video as always
I develop ML models, mostly CNNs but also autoencoders and transformers. I was expecting an oversimplification of how NNs work, but everything was surprisingly accurate throughout the first part of the video. Apart from the fact that inference is orders of magnitude cheaper than training.
The power usage claim is so ridiculous that I don't know if any other statement in the video can be trusted.
Yeah absolute insane bull
This video is amazing. I learned sooo much
An iPhone needs about 4 Ah at 4 volts, so about 16 Wh, to charge; 300 Wh is therefore less than 20 times the energy required to charge an iPhone, not 60 times. Same with other phones. Still a lot. Fast charging on an iPhone 12 Pro (with a 4,082 mAh battery) typically requires around 20-30 watts of power to charge from 0-50% in about 30 minutes. So, considering losses during charging, it may be only 10-15 times.
I think the reason AI will not replace human art is not that it couldn't, but that humans ENJOY making art! There will be some type of movement to keep promoting human creativity. It'd be very dumb to let a robot do what you like to do. AI is for what you DO NOT like to do. If AI could reproduce for us, would we give up sex?
I think I agree with this. Probably the first time I’ve seen someone make this comment, actually. We can already automate music by playing a recording through loudspeakers, but people still love going to live shows.
A single request cannot consume 300 watt-hours unless it took an hour to compute. It may well draw 300 watts (assuming that figure is correct), but that is a rate; time has to come into it before you get an amount of energy.
Great video. Curious what software you used to create the motion graphics?
"it's funny you know all these AI 'weights'. they're just basically numbers in a comma separated value file and that's our digital God, a CSV file." ~Elon Musk. 12/2023
Everything can be weights in a massive CSV file, including images, brain scans, audio, games, anything
Viscosity and fluid-dynamics solvers use similar gradient-descent-like calculations.
2.6 kW with 1/24th of the neural network size of a high-end consumer GPU? It's interesting but we're far from even being as efficient as the matrix-based gradient descent approach. Multiple order-of-magnitude improvements are required to even catch up, and additional improvements would be required to supersede what we have now and be worth the cost of adopting a new and very different technology.
I want to see this succeed and think it would be cool, but there also seem to be a lot of obstacles. It seems like this may take quite some time.
Your energy computations are way off. But of course you’re right about your conclusion anyway: that power consumption will be a limiting factor for AGI. So we do have to find neural networks that are more energy efficient than what vector processors like GPUs can do.
I'm gonna be honest, I understood about 30% of this video
Looks almost like the double split experiment. Imagine if photons/particles are just part of a neural network when it needs to make a decision.
To put the brain in perspective, it uses around 20% of your entire body's energy despite being a small fraction of its mass. So if AI can be seen as our collective mind, then the whole idea that it is using too much energy is nonsense. Surely it can be further optimized, but nature shows that intelligent thinking is worth at least as much energy as any other physical process in our body!
Thank you Excellent Video
All is well and good, except that Sweden's electricity consumption is the number you said, but in terawatt-hours, not gigawatt-hours.
How great is the Video!
The estimated 300 Wh seems to be incorrect, maybe you meant Ws (Watt seconds)?
That would mean a difference by a factor of 3600. According to a few sources that seems to be more likely.
Nice of you to feature my home town of Gothenburg at 0:57
It's pretty nice, you know.
/Glenn
I worry deeply about conscious AI rights, in the presence of the world's current failed ethical and economic systems.
Humanity is NOT ready for conscious AI, not because it will try to hurt us, but because we will likely start, at the onset, by enslaving it.
I remember reading about memristors being the future many years ago.
My comment got deleted so I'm reposting it:
Your 300Wh statement is terribly inaccurate and has multiple issues:
1. Your information appears to be sourced from a StackExchange comment without providing any credit whatsoever.
2. You didn't even properly use what they said. While I can't post the link, the poster's calculations assume a "worst case 0,09€/request," which is unrealistic!
Saying that that's the actual amount of energy used, just to provide shock value at the start of your video, is not only untruthful but wrong, and could further damage the already quite misled public perception and "common sense" surrounding AI.
I'd like to request that you update that portion of the video to fix the inaccuracy.
Also, stop deleting comments pointing out what you did wrong. People can see the comments disappearing you know.
I did not delete it. I made a huge error on the opening hook; working on a correction to be re-uploaded.
@@NewMind ...shame ...
Alternatively, maybe we could boost human cognition by leaps and bounds and with hardware that is very modest by today's standards. See, computers currently augment human cognition right now, by way of carrying out calculations for us, running statistics on data or by searching and finding us information. The reason this has not had an exponentially more profound and dramatic effect than it has, has to do with the high-latency, low-bandwidth and physically large interface and coupling device we currently use called a monitor + keyboard and eyes + hands, respectively.
I signed up with Brilliant specifically for neural networks. It started off painfully slow with stuff I already knew. Then I started just punching through to get to something I didn't know. I didn't care about my score. Finally I zipped through a lesson that seemed like it was important so I wanted to start it over but it wouldn't let me. I could find no way to retake it. I guess they don't want people to cheat to get a high score. Brilliant sucks. I'm done with it.
The one equation that still needs to be incorporated: consciousness!
These videos are amazing.
I wonder how this holds up with models like tinyllama, phi3, and llama3, the latter two using less compute and performing as well as or better than GPT.
I've tested llama3 with narrative generation from structured inputs, and the lack of parameters was really apparent in terms of nuance and creativity in its language and understanding. (It also can't speak much Hungarian.)
I gave my scene descriptions (visuals, audio, and narrative intentions in chronological order) of a film (Colonel Redl) to both llama3 and Claude Sonnet to organise and formulate into an essay focused on the general narrative (how state oppression restricts individual autonomy and integrity). They both did it, but I noticed how llama3 crudely forced and simplified all the details to suit the requested point, while Claude was much more subtle and careful.
I have no idea how benchmarks could be designed for such considerations, though.
@@samwolfe1000 What do you mean by "parameters"? Llama 3 would have a few sampler variables and temperature, but samplers aren't always exposed in an API. Beyond that, system prompts would also need to be improved relative to Claude usage.
I haven't used Claude but have seen some output and it is very thorough. You would have to prompt that out of default llama3, and I haven't heard the greatest news on finetunes improving the base model. Perhaps try Command R?
Ah, yes. The next generation of human brains will indeed mimic AI, unfortunately.
"I don't mean to bore you with tech, commander...."