💣Corrections of Mistakes:
6:10 - The Big O run-time of a compute shader is not constant!
It's O(n/p). n refers to the number of work-item elements, which is a different variable from the number of processors (p).
Of course p >= n can happen, but they are two distinct variables.
For p to change the complexity class, it would have to scale with n in some way; in practice, p is just a very big constant for a specific device (see the short sketch below this list).
2:19 - Computer Engineers didn't invent Big O. Mathematician Paul Bachmann did.
5:04 - I didn't explain how to write WGSL in this video. To learn a bit more about this language, take a look at my previous video.
5:52 - When you have a for loop, the Big O run-time is O(n) only if the body of the loop runs in constant time.
12:52 - I'm sorry for my incompetence in writing Rust. Maybe I shouldn't have used it in the video.
15:11 - Here I'm comparing a Binary Tree implementation in WebGPU to a DFS implementation in C++ and JS. I should've been clearer about this.
17:23 - In this screenshot, I forgot to use the WindTurbulenceForce that is used in the compute shader.
19:08 - I lied. Lying is not good 🙂
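To make the 6:10 correction concrete, here is the arithmetic in a deliberately simplified model (a perfectly parallel map, ignoring scheduling and memory-transfer overhead):

T_{parallel}(n) = O(\lceil n / p \rceil) = O(n / p)

For any specific device p is fixed (say, 2560 shader cores), so O(n / p) collapses back to O(n): the constant factor shrinks by p, but the growth class stays linear.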
You're also wrong that Javascript is "so damn slow". It's actually one of the fastest scripting languages in existence, and can approach C/C++ speed if you use WASM. It absolutely can be (and is) used for games, it's just not the absolute fastest solution compared to native and isn't used for AAA games. Some of the games you've bought on Steam were probably made with JS.
@@scornwell100 cope
@@scornwell100 well it's all about the compiler and runtime, like js with bun is way faster than nodejs
@@scornwell100 I'd have to disagree with you because I've worked on game projects that couldn't be implemented by purely using javascript. That's exactly why Wasm is so useful... ( but Wasm is not JS and JS is not Wasm )
@@visionary_3_d For game projects yeah it's slow but when used for its intended purpose of writing a website Javascript is incredible. The problem is that it's so easy to learn and incredible for websites with such strong support behind it squeezing every drop of perf out that people think it belongs in everything since "I know it already". It's very capable as a language, but just capable enough to be deceptive.
Its speed can top WASM in some circumstances. DOM manipulation is margin of error at this point but anything string related will typically be faster in JS since most of the languages you write WASM in use UTF8 encoding which has to be turned into UTF16 when passed back to the browser. Even if JS never made the mistake of UTF16 and WASM had native DOM manipulation, JS is plenty fast for its use cases.
JS speed hate is misdirected, the true speed criminal of the programming world is Python making people believe their computers are dozens of times slower than they are.
Javascript isn't slow when compared to its peers. It's slow if you're comparing it to C/C++ and Rust (aka compiled languages) and even Java/C# (compiled bytecode languages). But compared to Python, Perl, PHP, Ruby, and Lua (aka interpreted languages), Javascript is literally the fastest. The issue is that people like using Javascript for things it shouldn't be used for, so it feels slow.
Until Hermes Static 🤩
Java_script_ is slow in 99% of browsers
Java is fast, JS is not
Using the "right" tool can make JS way faster, with Bun JS may be compare with C++ and Rust in many aspects and honestly I don't know if JS can be faster than lua or luajit
That’s an interesting take and for the most part I agree.
The main problem is that I wanna build games on the web and JS is too slow for that in most cases.
@@visionary_3_d Have you seen Three.js? Cuz GPU computing already exists in js in the form of WebGL (which three.js is using in the background), and it's really common for games on the web.
There are a few things I need to point out about the video.
- Time complexity describes how the time taken by an algorithm grows with the number of items N. For a simple algorithm such as the addition of two arrays of N elements, the time complexity is O(N) both on the CPU and in the shader. The algorithm is exactly the same in both cases, so using O notation doesn't make sense in this context.
The difference in speed here comes from the number of cores the GPU has. Basically we compare O(N) with something like O(N/512), or even better depending on your GPU chipset.
Addition of two arrays is certainly not feasible in O(1).
- The time to feed the buffer in GPU-land is bounded by bandwidth (the PCI Express bus), and when transferring from Javascript there is a ton of overhead related to browser security concerns and the communication between Javascript and the deeper C/C++ layer, hence the time taken.
Other than that, I think it's a pretty good explanation.
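To picture the O(N) vs O(N/512) point above, here is a minimal sketch of the array addition as a WGSL compute shader created from JavaScript, one invocation per element. It assumes a GPUDevice named device has already been acquired; buffer setup, the dispatch, and the read-back are omitted:

// Each invocation adds exactly one pair of elements, so wall-clock time
// scales roughly with N divided by the number of cores actually available.
const shader = device.createShaderModule({
  code: `
    @group(0) @binding(0) var<storage, read> a : array<f32>;
    @group(0) @binding(1) var<storage, read> b : array<f32>;
    @group(0) @binding(2) var<storage, read_write> out : array<f32>;

    @compute @workgroup_size(64)
    fn main(@builtin(global_invocation_id) id : vec3<u32>) {
      if (id.x < arrayLength(&out)) {
        out[id.x] = a[id.x] + b[id.x];
      }
    }
  `,
});

Copying a and b into GPU buffers (and reading out back) is still O(N) over the bus, which is exactly the overhead described in the second point.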
Yess sir. Look at the corrections in the pinned comment.
And also thanks for taking the time to explain all of this ♥
Finally, a video on the internet that talks about C++ and Rust and doesn't slander C++. Cheers, brother.
I think C++ is great 😄😄
@@visionary_3_d So do I, but I think I might be a tad bit biased considering I do it for living.
I mean the part where he shows how Rust is slower than JavaScript for math operations should have made you quite suspicious.
Very interesting. I'm working with WebAssembly right now. One of the downsides of webassembly is that if the input size is large, the cost of moving the data across the JS/WASM boundary will be too high for it to be worth it. It's best to have a small amount of parameters that result in a lot of computing. Using wasm for encrypting data is really effective, as an example.
So if WebGPU is powerful for computing large amounts of data, we can use javascript, webgpu and wasm for the following best cases:
- Javascript: small amount of computing, or simple computing where the JIT compiler will speed it up for us;
- Wasm: Expensive computing that we can't run in parallel (rust wasm supports async, but doesn't support multiple threads), lots of computing for a small amount of data;
- WebGPU: Large amounts of data that can be computed in parallel;
That’s the perfect conclusion IMO 👌👌
I think the motorcycle/bus analogy is better at explaining the difference between GPU's and CPU's. A CPU is like a motorcycle, it can quickly take you and only you to your destination. If multiple people need to use the motorcycle, they'll have to wait for their turn, or you need to get more motorcycles (cores).
A GPU is like a bus, it can take a large group of people from the same point A to the same point B at a bit of a slower speed. This works great if everyone is coming and going to/from the same places, since it'll be much faster than taking one person at a time with the motorcycle.
I actually didn't know about this analogy.
That's actually very interesting. Thanks for sharing ❤
I've tested performance between js and wasm and there's an important note. Similar to sending data back and forth between CPU and the GPU, there's a high cost to sending data back and forth between the wasm worker and the main thread that needs to be accounted for.
Calling wasm just to perform one operation and then reading back is usually not worth it, and can even be worse than doing it in plain js.
The system should ideally be designed to make the best use of the resources it has available.
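A rough sketch of that boundary cost from the JS side. The exports addOne, addAll, alloc, and free are hypothetical names made up for this example, not a real API, and data is assumed to be a Float32Array:

// Worst case: one JS -> WASM call per element; call overhead dominates.
for (let i = 0; i < data.length; i++) {
  data[i] = wasm.addOne(data[i]); // hypothetical per-element export
}

// Better: cross the boundary once. Copy in, process in bulk, copy out.
const ptr = wasm.alloc(data.length); // hypothetical allocator export
new Float32Array(wasm.memory.buffer, ptr, data.length).set(data);
wasm.addAll(ptr, data.length); // a single boundary crossing
data.set(new Float32Array(wasm.memory.buffer, ptr, data.length));
wasm.free(ptr, data.length); // hypothetical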
I agree with you.
A developer should use these technologies carefully and wisely.
Ok, was first triggered by the “so darn slow” (the javascript multi stage JIT compiler is a piece of marvel)… and then I saw the sub-title on 1:09. Well played. Great video!
Fair enough! 😆
Excellent video! Loved the pacing and the amount of info you provided in this one. Enough to make people excited and start exploring the concepts on their own.
Appreciate the effort! Thank you.🙏
I appreciate your kind words. 🫡😇
You should try Zig, especially in a WASM context! If you're coming from C++ you'll find it a lot easier than Rust. It's also very easy to set up a fixed-size pre-allocated buffer for the fastest possible communication between WASM and JS, and won't have borrow checker gymnastics standing in your way.
I’ll definitely try it 👍
Thanks for the suggestion.
Good point. Bun javascript runtime (Node alternative) was written in Zig.
Although I couldn't grasp quite a bit of this video since I am a novice dev, the quality of the video is superb. Hope to see you grow.
Thanks for the kind words ❤
My first suspicion on the Rust code is you're using for loops with safe array access, so every index access is checked. Basically it's like using .at(x) in C++ rather than [x]. The mapreduce functions are intended to be the main way to access arrays safely, since they don't need to do the index check on every read/write. There's an unsafe access too, which is the same as the C++ [x] (with all the out of bounds memory access risks involved).
He's gonna summon @ThePrimeagen saying things like "Blazingly Fast" 😂
😂😂
Already sent this one to r/ThePrimeagenReact. Now we wait if he does.
I'm blown away. Not by WebGPU (although that's pretty incredible too), but by how amazing this video is. The pace at which you give information was on point, animations were pretty cool and I can't believe you took the time to implement these algorithms in wasm and webgpu! Keep it up man, loved this video and I'm excited to see your future ♥
Thanks for the kind words.
What you said is what gives me motivation to continue doing this work.
Thank you ❤️
It's not yet compatible with mobile browsers, but here's hoping the day is coming soon
Awesome video!
Could you explain how you measured the time for GPU compute without including the buffer copy / how you were able to measure the time of the buffer copy and subtract it?
Yeah sure.
I'm using a special flag in chrome which allows me to use some timer tools.
Here's the flag:
--enable-dawn-features=allow_unsafe_apis
you can learn more about this here:
matrix.to/#/!MFogdGJfnZLrDmgkBN:matrix.org/$D342y3gRpy3EndUMK1nzbL7Qwvn6wozusB6wCROTuZs?via=matrix.org&via=mozilla.org&via=matrix.nrp-nautilus.io
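For context: that flag unlocks WebGPU's timestamp queries, which bracket just the compute pass on the GPU. Here is a rough sketch based on the current spec, assuming the 'timestamp-query' feature was requested when the device was created:

const querySet = device.createQuerySet({ type: 'timestamp', count: 2 });

const encoder = device.createCommandEncoder();
const pass = encoder.beginComputePass({
  timestampWrites: {
    querySet,
    beginningOfPassWriteIndex: 0, // written when the pass starts
    endOfPassWriteIndex: 1,       // written when the pass ends
  },
});
// ... set the pipeline / bind groups and dispatch here ...
pass.end();

// Resolve the two 64-bit nanosecond timestamps into a mappable buffer.
const resolveBuffer = device.createBuffer({
  size: 16,
  usage: GPUBufferUsage.QUERY_RESOLVE | GPUBufferUsage.COPY_SRC,
});
encoder.resolveQuerySet(querySet, 0, 2, resolveBuffer, 0);

Reading the buffer back (via a MAP_READ copy) as a BigInt64Array and diffing the two values gives GPU-side compute time without the buffer copies.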
The interesting comparison would be how much slowdown the webgpu abstraction introduces compared to a C++ CUDA implementation, rather than comparing CPU vs GPU code.
Great idea for the next episodes…
CUDA C++ is just way too good haha.
I know 0 about WebGPU, but beating CUDA C++ wouldn't be that easy.
I wonder if the Rust and C++ versions of your code that were compiled to WASM were using SIMD where possible. Sometimes the compiler can detect and optimize for this, but it often only works for simple loops.
SIMD isn't supported in web browsers. Yet...
@@visionary_3_d According to the WebAssembly roadmap, fixed-width SIMD is supported (for WebAssembly) by all major browsers and WASM runtimes. It is just not available in JavaScript. Since the SIMD support is for 128-bit operations and you were using f32, this could potentially provide a 4x performance boost to both the Rust and C++ versions, provided that the algorithm can be vectorized.
@@KPHIBYE huh. You're actually right.
I must've looked at some old website when I was preparing my video. I would've loved to include SIMD too.
Thanks for the suggestion.
I'll use it in future videos...
It is impossible to do all the additions in parallel for any big enough number of additions. You are limited by the number of compute units in the GPU.
Also it is very hardware dependent. You are better off using a 64-core Threadripper with a parallel algorithm (if you have one) than a GTX 1030 for the perlin noise :)
Also you have an error in the O notation at 12:37 - if n is the size then that is O(n^2), since there are two loops of size n, one inside another.
P.S. You are right about Rust. It can be as fast as good C in the best case, but if you have the best C code Rust cannot match it.
You're right about the limitation of the GPU.
And yeah, Rust can be fast (if you know how to make it fast), but it's easier to get it right with C++ the first time.
For me at least.
19:08 Lying doesn't inspire people. Not for long anyway. The compute shaders of WebGPU might be great for simulations, machine learning and other math-heavy 3D operations, but 99.9% of the code is sequential for a reason, and introducing yet another language only makes code maintenance worse.
1:53 That subtle flex tho
Glad somebody noticed this 🤣🤣
I wish we just got access to the GPU memory and compile programs into it and use the hardware directly. Not sure how achievable that is though. There might be compatibility issues, mainly across different hardware vendors
That's pretty much exactly what happens, but we're missing some features with current WebGL. Right now you have to attach a vertex + fragment shader, no option for compute. You can still workaround by passing data through the vertex shader into fragment, and doing some hacky stuff with reading to/from textures or buffers to compute things.
@@Xld3beats What about the rest of the pipeline, like geometry or fragment shaders? (I haven't touched graphics for a long time now)
@@nexovec it's pretty much limited to vertex + fragment shaders, which is why compute shaders would be nice. It's not necessary though, you can use fragment shaders to compute things but it's hacky. Someone made a mandelbrot set in WebGL and it still runs really well
@@Xld3beats I'm pretty sure you can even emulate the geometry shader in the vertex shader too, but all this sounds really hackish.
I really feel like hardware acceleration is going to be a large pain point for browsers one day, as computers get more and more dedicated hardware for AI, media, network and other processing, because I can't see the GPU web support trend changing.
@@nexovec WebGPU and WASM are in the works and should allow us to build much faster apps. Web-based apps are pretty nice since they work on nearly every device, like WebGL is supported on about 98% of devices
Really good animations and overall video editing. Although... it's just using the GPU... you can already do that in Vulkan, I think.
Thanks!
Not on the web though.
Even if this doesn’t make 5k, please make more!
haha I will...
Thanks for watching!
1:56 man just straight up roasted my pc out of nowhere
🤣🤣
thanks for this video. btw, can you explain the difference between webgpu and webassembly?
Wow this was a great video, really informative, good job! The visuals were on point as well. I would be interested in a beginner's guide to webgpu as well, I find the syntax a bit confusing, if you're looking for video ideas ;)
NVM I found your previous video
Thank Youuu ❤️
I'm confused as to the target audience; this was well made and informative, but it had so much filler I found myself flicking through it. I can't understand why someone who codes and knows what webgpu is enough to click on this would like being talked down to so much.
Interesting...
For my target audience I was imagining people who have heard about WebGPU but have no idea how to make it work and why it can be faster in some cases.
That's about it.
@@visionary_3_d sorry, upon reflection that was harshly stated, my apologies. But I honestly didn't mean it harshly. I'm terrible at communicating with the written word. Apparently even writing fully cogent sentences troubles me.
Thanks for the reply, I felt like my comment was jarring enough to warrant a better response. So I'm re-watching it, and I'll give more pointed feedback.
I found it jarring to get a full explanation of how to time a function and then a brief mention of big-O notation, where both of them could be a bit more brief. This kinda reminds me of a "bike-shedding" moment, where someone knows a lot about something, but it's a minor part of the larger bit of your project.
Then the specifics: at 3:57 the statement on cost is true in a hand-wavy way, but only in the time dimension; in the big-O dimension it's actually the same. Big-O does not care how many cores you can run something on. Its time cost went down, but at the expense of compute power.
Then after a bunch of technical concepts you use an analogy of a portal to transmit data to the gpu.
Some of your choices are entirely stylistic and that's fine! (we don't all have to be the target audience) but some of the above made it difficult to watch.
The particle demo at the end was nicely done, superbly technical and well spoken, it's something you have passion about. And it's obvious! But it also had the right amount of conversation, technical explanation and practical demonstration.
Anyway, now you got me to watch it twice, ha! I hope this helps!
For the Rust code, are you building a release version of the app? The debug build is usually 20 times slower.
Yeah I'm using the release version ( --release ) with wasm-opt and maximum opt-level.
@@visionary_3_dNice okay. I just know I made that mistake for my first project 😂😂
@@davidt01 No problem. Thanks for letting me know.
I'd be curious to know why your rust code ended up slower than JS, I'm not an expert rust developer but I'm reasonably confident in the language and compared to almost identical C++ code I wrote it was always only slightly slower. Nothing significant to the point of being slower than JS. My big hiccup with it was trying to write Rust like C++ at first as in school I learnt Haskell and was told it was a "functional programming language" so when hearing Rust was the same I despised the style at first. Rust tends to optimise better when you write as functionally as possible.
I definitely agree with you that writing some mathematical algorithms in Rust can be rough though, as its borrow checker rules tend to get in the way more than they help. I'm sure if I knew more unsafe rust beyond just NonNull I could get around this but in the realm of writing something mathematical I think C and C++ will keep their top spot (someone feel free to comment about Zig).
It does seem like an odd comparison to make though, as Rust and C++ are both LLVM based so will tend to be within the same order of magnitude, with the differences caused primarily by Rust's safety checks preventing some compiler optimisations combined with C++'s frankly better optimisation. I feel C# might've been a better bench alongside these, given the video's focus on running code on the GPU, since C# is both a well known language (everyone knows of Rust, few people write it, nobody writes it for a living) and is commonly used in game dev (Unity, and to a lesser extent Godot). It also, I believe, has web app relevance in the form of Blazor, although I've never touched it myself.
(Everyone knows rust, few people write it and nobody writes it for a living.) 😂
Thanks for suggesting Zig and C#. I’m interested in trying those out.
I think an expert on Rust can def write faster code than me…
But in my comparison the algorithms are the same. So I can imagine a scenario where Rust is faster than C++ when you optimize your Rust code.
@@visionary_3_d It can be faster but it's certainly rarer. C++'s assumptions about things like UB can lead to it making more aggressive optimisations that Rust just can't.
Ultimately unless you're doing high speed trading or are in some other environ where nanoseconds matter they'll tend to be similar enough that choosing between them is a matter of which you prefer or know best.
This is just me nitpicking on a language I like though, great video. Does a good job of making WebGPU exciting but also making expectations for it realistic.
Another interesting upside on WebGPU is how much easier it is to get started with than (for example) CUDA. It just makes GPU accelerating your algorithms so much more approachable with less horrid dependency management, namely CUDA toolkit breaking or updating constantly. Even if you're not writing code to run on the web it just provides such an easy way of interacting with the GPU it's great.
Compiling C++ with emscripten becomes painful when the external dependencies become complex.
Yep.
+Long compile times...
pretty cool video, did have a question for the simple addition example. at 7:15, could you try using a shared array buffer instead to skip copying data over?
Isn't the simple addition example still O(n), since you are still technically limited by the amount of operations you can do in parallel? Sure it's much faster, but the O-notation is unaffected since it's O(n / C) where C is a constant depending on the GPU.
You're 100% right.
Since I wanted to make this video beginner friendly I skipped this detail.
It's much easier to think everything happens at the same time and write code that way.
But now that you mentioned this, I'll put this in the mistakes list...
Nice video! And thank you for all your efforts, but you are not comparing apples to apples. A better comparison in my opinion would be JS & webGPU vs C++ & Vulkan
Great idea for my next videos… 👌👌
i think you including rust in this video is amazing, i feel much better because i tried it too and couldn't do anything, i was feeling dumb
Man this is amazing content. Thank you
I've been waiting for WebGPU for a while now! Seeing the physics sims coming out of BabylonJS.
Going to beat the hell out of passing a 1xN sized texture buffer to some deferred pass in WebGL to mimic a compute shader!
Is Rust worth learning?
Cause I went JS/TS -> C++ WASM, haven't really felt limited at all, besides transfer latency.
JS/TS -> C++ WASM...
That's exactly what I use.
Rust is definitely worth learning.
It shows you what a really powerful programming language looks like ( I'm in love with the macro system of Rust ).
However I don't use Rust in my projects for now...
Only for having a little bit of fun.
@@visionary_3_d Macros seem pretty cool, gonna look into this more, thanks!
Kinda seem like compiler directives which can act like partials/lambdas without the fuss, unless I'm off base?
Jeez, thinking about how many times I've needed to write overflow functions in the past...
People who say javascript is slow in the browser are wrong, not even because they think javascript is slow, but because they think the programming language is the bottleneck. No. It's the rendering engine of the browser. Style computations, layouts, DOM updates! If you try to draw a page inside a canvas by hand, it'll be faster than the DOM even without extra sorcery.
Thanks for the comment.
You are right about DOM updates...
I used to write really complex math algorithms in javascript and I realized that I'd get a massive speed up if I used WASM instead...
So that's what I was talking about in the video if it's not clear.
Great video. Quality stuff
Thanks 🙏
C++ libraries/APIs have memory problems during the webassembly runtime (out of bounds, etc). If you want to use a library other than the C standard library, like ffmpeg, use the C version.
next: what is "atomic"
Fantastic suggestion...
Love it ♥
Try using a web worker pool while writing wasm next time.
How does all this work on an APU today? Will the speeds be different?
Both CPUs and GPUs have evolved a lot since conception, but GPUs have become monsters that occupy a lot of space and use lots of power. I'm pretty sure in the future we will go back to using only one processor that integrates the best parts of CPUs and GPUs and will run some chimera language that delegates processes accordingly and more efficiently. But right now it's a "pick your own tools" situation.
What is your hardware spec for 160fps?
I don't think there's anything flawed about the method of: getting the current time (t1) -> doing some computation -> getting the current time (t2) -> subtracting t1 from t2 to measure performance. The flaw here is that you introduced a new computer into the mix out of nowhere lol
Yes. That is a good method for measuring performance. I think in combination with Big O you’ll get much better results.
i wanna make 2 transparent triangles intersect each other, is it possible in webgpu??
I like the presentation and want to commend the effort and polish put into this video, but like other commenters, I feel the need to point out that there are some problems. Primarily, the Big-O notation is being misused pretty heavily, but also please never ever present misleading benchmarks without thoroughly investigating any similarities/differences they show (like saying that you're not sure your Rust code is good, or that you're really confident in your C++ code, and then just putting those results head-to-head). That said, keep up creating content!
Thanks. I also agree that this video needed a more robust fact-checking stage, which could've prevented some of the problems that you mentioned.
I am going to keep refining my video creation process, and your comment is gonna help me do that.
Thanks for writing this for me.
love your work brother
Glad to hear it.
Well done 🎉 Subbed!
Thanks for the tutorial. Is it possible to share the three.js code that generates the animation of the particle system as shown @16:30 of the video?
I'm planning to do that at some point in the future... ( with its own video ofc )
well done bruv!
Hope this thing is adopted into game engines like Phaser or Cocos Creator
This isn't JS at blazing speed, it's WASM so the title is a bit misleading. But yeah, all compute intensive stuff can finally switch to a more performant environment.
That's right 👍
Not WASM, but WGSL I believe *
@@krikusZ80 There's major benefits from just using WASM for most things (e.g. Lichess uses it for running stockfish online), some problems can't or aren't worth solving using the GPU (via WGPU). That was also possible in the past through WebGL, though somewhat harder to model as (I think) it doesn't support compute shaders.
0:05 skill issue.
You can at the very least rewrite standard javascript multiplication as an "instruction" loop of bitwise operations and that alone will speed up important game stuff like matrix multiplication. Imagine having to perform n^3 multiplications but now every multiplication is 20 times faster. That'd be great, much better than calling recursion for weird "faster" algorithms that remove only 1-2 multiplications in the end.
Can they utilize multiple GPUs? Seems like only 1 GPU was used on my system.
Hi.
I don't think it's possible at the moment.
I found this issue on the topic:
github.com/gpuweb/gpuweb/issues/995
Very good and informative video! I would love to see another one even if this doesn't get 5k likes 😳👉👈 (which I'm sure it will)!
Haha sure. Thanks for the kind words 🔥
1) Running the same algorithm on different hardware doesn't change its Big O runtime.
2) Copying data to the gpu or a webassembly worker takes O(n), so this approach shouldn't accelerate any function that needs O(n) anyway.
I think I made number 1 clear in the video.
about 2, I'm talking purely about the computation in the video ( in isolation ).
But yes, ultimately what you're saying will happen in most cases 👍
Did you compile rust with production flag and at the highest level of optimization?
Yes. I'm running the command with the release flag and here's the release section in my cargo.toml file:
[profile.release]
lto = true
opt-level = 3
Considering people use Python for math calculations, and Python is slower than Javascript, I would argue that speed is not important for that :D
Basically I don't need this yet as a robotics engineering student, but I'm looking at it I guess 😄
Who knows… 😄
Incredibly well done video
Weren't the shaders that babylonjs and three.js use based on gpu computation?
Vertex and Fragment shaders are used for rendering and not compute.
You can hack fragment shaders and get them to do compute work but you can’t return the data to the cpu as easily and it’s def much more difficult to get it to work with some algorithms.
Compute Shaders are much more flexible.
Now, let me note that Three.js and Babylon both now have Compute Shaders support but for Three.js there is little to no documentation at this time.
you missed the opportunity to use "mythbusters nvidia gpu versus cpu"
How to speed up Java script:
- Step 1: Use rust
- Step 2: Use C++
- Step 3: Use a GPU programming language BUT for god’s sake DO NOT USE JAVASCRIPT
😂
Am I tripping? Isn't the runtime at 12:37 O(n^2)?
can you explain?
how did you make the like button do the rainbow effect? that was cool. Also it sucks that safari on ios doesn't have webgpu
I'm pretty sure it is a YouTube thing. It detects the word "Like" being said in the video, and highlights the like button. Seen it elsewhere too.
@@felleg4 that's what I was thinking, but as click the like button was said several times beforehand I wasn't sure
Really fantastic thing
Thanks !!
Try with zig
JS IS DEFINITELY NOT SLOW. JS is the fastest dynamic language there is, not only to run, but especially for implementing any type of algorithm, because it's an extremely expressive language, and it's the internet overlord. There's almost 30 years of optimizations making it run even faster than C++ in some scenarios. Google's V8 engine actually replaced almost 30% of its C++ code with JS to enjoy those optimizations. You even reached the same conclusion in the video, running the same algorithm in JS faster than Rust. Even before all of the JS optimizations we have today, there were already 3D games running in JS, even before WebGL, the predecessor of WebGPU.
I agree with everything you said ✌️
For the GPU image, showing an external graphics card is kind of becoming passé. I type this on an iPad. Has GPU. Benefits from WebGPU. But no physical card.
Alternate image: highlight GPU cores of any modern APU chip.
Great suggestion.
I'll consider it for my next videos...
Did you do cargo build with a release flag???? For rust
Yeah I think wasm bindgen does that automatically.
@@visionary_3_d It doesn't. You need to add lto = true, and you'd also want to use wasm-opt.
@@fow7139 that's right. I'm using both of those in my cargo.toml file:
[profile.release]
lto = true
opt-level = 3
@@visionary_3_d wasm-opt is not opt-level.
@@fow7139
ah ok.
I had no idea about that...
Check this out:
bytecodealliance.github.io/cargo-wasi/wasm-opt.html#disabled-when-wasm-bindgen-is-used
I was almost crying at the end of the video man
😄
@@visionary_3_d Do you think Babylon is a good option to exploit webgpu? Or is there any way to do it with three.js?
@@richardramirez5746 If it's easier for you then sure.
Thank you!
You’re welcome 😉
(C++) + webGPU = pure power and performance 🙃
If the data is in gpu land, is it persistent, or can we use malloc-like functions to delete the unnecessary data after the task has been completed?
I don't think we have deep enough access to do that.
It's handled automatically at the moment.
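Small caveat: most lifetime management is indeed automatic, but the WebGPU spec does expose an explicit destroy() on buffers for eagerly releasing GPU memory. A minimal sketch:

const buffer = device.createBuffer({
  size: 1024,
  usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC,
});
// ... use the buffer in a pass ...
buffer.destroy(); // eagerly frees the allocation instead of waiting for GC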
JS devs will get you to use a powerplant to add 2 u8 integers, but will not for their lives optimize.
MAXIMUM COMPUTE POWER! WEBGPU!
"Every sum is happening at the same time" bro that's some quantum shit right there
😅😅
Actually JS is fast enough for many types of games that don't require crazy performance. I think the slowness mentioned here is exaggerated.
I agree with you that it works great for "many" games but remember that we don't have a single FPS game that's actually active on the web.
@@visionary_3_d You are right. But the reason for that is not JS/WebGL performance in my opinion, but browser limitations on how much memory webapps are allowed to use and store.
you definitely inspired me to learn Rust)
😁😁
Leave Rust for Application Code or the cool kids out there.
Core Components, Library Code, or any heavy-lifting etc.. - that's C++ Land, and it's gonna take a while till something beats that.
All hail C++❤
You do know that you are comparing code from a C++ professional to a Rust beginner?
@@phantombeing3015, excuse me, what?
Could you rephrase that? I'm not sure I got your point.
Thanks.
@theintjengineer The one who made this video knows C++ very well but seems to just be learning Rust. To compare performance, it would be better if someone who knows Rust well wrote the code and then compared.
@@phantombeing3015 Rust is fast and safe, but being safe usually incurs a performance cost. C++ can optimise more aggressively than Rust because it has undefined behaviour, but Rust ensures it's safe and has defined behaviour.
Compiling Rust to WASM is not at all a fair comparison with JavaScript since V8 runs that with no overhead. I don't know what the C++ compiler output, but this should be telling you something about the state of WASM and not of Rust.
So we could have JS-miners installed along with ads? Finally!
Nothing good will ever come out of any ecma. JS-programmers are referred to as the modern lumpenproletariat for a reason.
JS is not slow. Idk why people are stuck in 1999. JS is like half the speed of the similar c++ code, if the code is well optimized for JS engines (using typescript solves most of the problems).
I agree. Sometimes that 2x speed improvement helps a lot in games…
hi, great video, could you upload the code somewhere?
I will upload some parts of the code for sure.
( soon )
The particle system I can not.
@@visionary_3_d alright thanks 👍
on point subbed!
good goood goooooood! thanks!
Thanks for the gooood vibes ♥
@@visionary_3_d thanks for spending such a large amount of time on content for us! it's so inspiring, and after your video I love my C course much much more 🙂
This was interesting until you saw JS being faster than rust and didn't even question it and instead used it to imply that Rust is worse than C++ lol
I did try to make rust as fast as possible.
But something to consider is that wasm-opt doesn’t work with wasm bindgen.
So maybe I should’ve used a better compiler.
Blazingly fast
Grateful
Greatttte man❤
Glad you liked it.
I've never heard someone say WGSL like that lmao
It’s how that word is pronounced 😅😅
This is great, I just don't get the WebGPU name. It's still a WebGL, a graphics library. I suppose this is a totally new project and you guys had other ideas or for whatever reason could not work with the WebGL folks, at least please name it something else, that makes sense. I don't know, WebGL-Next? Anything but this. WebGPU is kinda stupid for the awesome work you put in there. Thanks!!!
You must be doing something wrong with Rust, because it is not slower than C++ and certainly not slower than JavaScript. Maybe you are running an unoptimized debug build of your code, which can easily be 10x slower than an optimized release build. Or maybe you're implementing the algorithms in an inefficient way in Rust (for example, copying data unnecessarily when passing it to functions).
The rust code is available on github ( link in the description ).
in my cargo.toml file:
[profile.release]
lto = true
opt-level = 3
---------------
let me add that you may be right!
I'm not that experienced with Rust.
However I haven't been able to find why Rust is slow in this case.
I'm learning WebGPU, I wanna make cartoon objects
Maybe I should make a video on that 🤔
I thought we were making js faster but i see rust all over the place
jeez some things just shouldn't be done. stop using javascript, my lord.
The theme is interesting but the format is not; having everything you say displayed on slides is not very fun to follow (one can get the transcript for this), and the video is way longer than it should be.
Interesting feedback.
Thanks 🙏
showing numbers without code isn't that helpful tbh
I’ll publish it soon…
GPU? more like a baffling amount of JS at 2.0GHz
--release