2500% Perf Improvement in Node

  • Published: Sep 12, 2023
  • Recorded live on twitch, GET IN
    / theprimeagen
    Reviewed article: gal.hagever.com/posts/my-node...
    Author: Gal Schlezinger | / galstar
    MY MAIN YT CHANNEL: Has well edited engineering videos
    / theprimeagen
    Discord
    / discord
    Have something for me to read or react to?: / theprimeagenreact
    Hey I am sponsored by Turso, an edge database. I think they are pretty neat. Give them a try for free and if you want you can get a decent amount off (the free tier is the best (better than planetscale or any other))
    turso.tech/deeznuts
  • Science

Comments • 249

  • @mgdotdev 9 months ago +275

    Congratulations y'all finally discovered the Python equivalent of C extensions

    • @justdoityourself7134 9 months ago +4

      Exactly.

    • @LeoDutraBR 9 months ago

      Shitty languages mastery unlocked.

    • @Disorrder 9 months ago +3

      nodejs supported C modules from the beginning

    • @justdoityourself7134 9 months ago +5

      @@Disorrder And they've changed 20 times and suck compared to python. You forgot that part.

    • @Disorrder 8 months ago

      @@justdoityourself7134 that’s fair

  • @girlswithgames 9 months ago +36

    21:00 rust intuitively pushed him to a better solution. that's pretty powerful

  • @rickmedina4082 9 months ago +41

    (writes it in the slowest js possible)... "Boss, it looks like there's no other option but to write it in Rust 😇"

  • @batatanna 9 months ago +661

    Who would have thought that replacing JavaScript by literally anything else makes it faster

    • @ConcerninglyWiseAlligator 9 months ago +82

      Not 90% of web developers, i'll tell you that much.

    • @ShadowKestrel 9 months ago +50

      DHH on his way to rewrite the entire web in ruby

    • @AveN7ers 9 months ago +28

      even Python?

    • @flogginga_dead_horse4022 9 months ago +5

      enter Bun 1!!!

    • @greatbullet7372 9 months ago +13

      my former team lead thought of JS being the cream of the crop and its performance irrelevant.

  • @FaraazAhmad 9 months ago +90

    Gotta keep in mind that Rust's lifetime notation being explicit enabled OP to find the right data structure whereas in JS it wasn't upfront how memory was being managed behind the scenes.

    • @garretmh 9 months ago +29

      My thought too. Rust solved the problem by helping them write it better rather than by running it better.

    • @rawallon 9 months ago +10

      Sounds like a skill issue

    • @JoeTaber 9 months ago

      Best take.

    • @glokta1 9 months ago

      Love your stuff

    • @JonathanDunlap 9 months ago +4

      JS allows the developer to mostly ignore memory management, and when performance issues are spotted, they can fix it by improving the hot spots in the code. Overall it seems pretty simple if they just stuck with JS and were more careful in their hot path.

  • @cookie_of_nine 9 months ago +61

    Ah the standard pattern:
    - Write JS code with a memory leak
    - Complain that GC is bad
    - Switch to language without GC
    Unlike what the article asserts, the problem was never the GC, other than it let them get away with having a memory leak until they decided to do something about their performance issues. They could have fixed the leak fairly easily, but instead chose to write a completely different algorithm in Rust (which would have worked for JS too) instead of trying to actually write good JS code.
    Yes, this was a memory leak. Sure, a decent GC usually means that you don't have "leaks" in the traditional sense (i.e. losing track of a block of memory so you can't free it), but a GC can still keep memory around that you weren't expecting, because there was a reference to it you didn't realise was there.
    In this case, even though the files were being processed "line-by-line" the code was keeping every line in memory by virtue of the small portions being kept in the permanent list of records. As a result, the entire file was read into memory and kept there for the entire operation. The GC could do nothing to help because the code was keeping valid references to each line. Sure it would all be correctly disposed at the end, so not technically a "leak", but the whole point of processing line-by-line is to avoid having to keep the entire file in memory, yet the code ended up doing exactly that.
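A minimal sketch of the alternative this comment points at, assuming the job only needs a count per (pathname, referrer) pair: aggregate into a `Map` while reading, so nothing is kept per line. The column order and the `countLines` helper are hypothetical; the article's real record layout is not shown in this thread.

```javascript
// Hypothetical column layout: pathname first, referrer second.
// Instead of pushing one record per line into a long-lived array,
// bump a counter per distinct key and let each line go.
function countLines(lines, counts = new Map()) {
  for (const line of lines) {
    const [pathname, referrer] = line.split("\t");
    const key = `${pathname}\t${referrer}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}

const counts = countLines([
  "/home\tgoogle.com\t200",
  "/home\tgoogle.com\t200",
  "/pricing\t(direct)\t404",
]);
console.log(counts.get("/home\tgoogle.com")); // 2
```

With this shape, memory grows with the number of distinct keys rather than with the number of lines read.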

    • @poolkrooni 9 months ago +1

      Amen 🤌

    • @oblivion_2852 9 months ago

      Could you fix this by deep copying the data instead of referencing it? And if so how?

    • @monad_tcp 9 months ago

      17:40 that's what I always say: the problem isn't using GC, it's the fact that having it makes it so easy to create trash. If you use manually managed memory and do a lot of malloc/free, you are also going to be slow.
      Using the heap that way is bad for performance.

    • @kurdm1482 8 months ago

      great minds think alike, kudos.

  • @paulomarques742 9 months ago +23

    Nah , the real problem was between the keyboard and the chair.

  • @greendsnow 9 months ago +93

    well... is he working at a day job? or does he do video streaming 24/7 at this point?

    • @perc-ai 9 months ago +18

      Both bro lol he is staff arch at Netflix

    • @phantombeing3015 9 months ago

      @@perc-ai whats staff arch?

    • @PavitraGolchha 9 months ago +4

      He's moonlighting Netflix 😂

    • @foxooo 8 months ago

      Bro is so handsome he can do both

  • @andrewdunbar828 9 months ago +16

    And viola. Won't somebody think of the cellos?

  • @wi1h 9 months ago +19

    it's nice that in rust, if he wanted to do the same approach as the JS one, he would have had to explicitly clone the line to put it into a record to keep after reading the file, which should sound some alarm bells

  • @sk-sm9sh 9 months ago +30

    People like to bash Stack Overflow, but these guys could have just gone on Stack Overflow, asked "my js code runs slow, how to optimize it", pasted their code, and in a day they'd have had a few tweaks to their JS code that get it to run 2000% faster. Maybe not as good as 2500% with Rust, but good enough, and they would have saved many, many hours of spinning up 25 containers and rewriting code in Rust.

    • @georgehelyar 9 months ago +18

      They probably would have had it flagged as a duplicate and downvoted to oblivion, but then they could have followed the link to the duplicate question. It's quicker than searching ;)

    • @retardo-qo4uj 9 months ago

      Noobs like to bash SO

  • @vinno97 9 months ago +69

    This entire thing looks like it would be a very straightforward Spark job

    • @crackerjackmack 9 months ago +7

      Straightforward once you learn spark's ecosystem, which isn't simple.

    • @georgehelyar 9 months ago +4

      Yea this is already a solved problem and there are many tools that can do this job easily.
      One tool I've used before to do this is Azure data lake analytics. Ingesting some TSV using some U-SQL on a schedule and writing it to a database is easy.
      You should never try to reinvent the big data wheel.

    • @valerioversace9604 9 months ago +20

      Guys it’s literally a for loop

    • @sammysheep 9 months ago

      Spark or hive Or Impala or any schema on Read distributed analytics tool. This was my thought too.

    • @rawallon 9 months ago +1

      Lets be real, he could've just used chat gpt

  • @chepossofare 9 months ago +71

    It reminds me of the best advice I got in CS:
    use the right data structure and the program will write itself.
    (PLUS: Use the right tool for the job, and Js is not in this case)

    • @ThePrimeTimeagen 9 months ago +21

      that is the case

    • @sk-sm9sh 9 months ago +9

      JS maybe isn't the best choice, but it isn't terrible either: it's a fast, well-optimized, garbage-collected language, and as mentioned in the article they already have a Node.js ecosystem (even their Rust code ended up embedded in JS). So Node.js makes sense, but what they did with the Rust module was just completely unnecessary overkill. It was so unnecessary that I suspect they intentionally didn't show a hashmap version of the JavaScript, because their managers would have been asking some tough questions about why they had to spend all this time working on a Rust version that is just some marginal % better while introducing a more complicated toolchain and build process. The way they've written it, they're just gonna sell it as a big win. Shit like this happens all the time when managers in tech don't have strong grounding in computer science and can't read through bullshit.

    • @karakaaa3371 9 months ago +4

      Even JS should have no problem parsing some TSV files? How is this not just disk/network limited?

    • @dyaadin8025 9 months ago

      This, so true

    • @SimonBuchanNz 9 months ago +1

      @@karakaaa3371 I did the work. It's a bit painful to parse tsv streaming in Node, but with a bit of work I got it processing 2GB in about 20s, and with about 2 hours of work optimized that down to 7.7s using the least natural code possible to avoid any allocation: all indices into typed arrays.
      The trivial, a baby could write it, a third the size, version in Rust was 7.2s.
      So yeah, realistically shouldn't have been *so* much slower, but if you care about both performance at all and maintainable code, jumping to Rust is hardly dumb.

  • @LukasSkywalker_ 9 months ago +29

    As a Node developer, I can't agree more. It is so easy to create memory in node that it fills the memory with garbage.
    For any intensive task, other tools must be considered before starting any project. For these I mainly use Go (love it).
    I may try Rust again soon

    • @FlanPoirot 9 months ago +4

      yeah same. so many people brush off performance and consider "choosing a fast language" as a "early optimization". using more resources than needed for anything on a computer should be frowned upon even a naive implementation on a decently performant language is faster than these interpreted languages people shove everywhere nowadays.
      they seem to forget that no program runs in isolation, if everyone makes everything in these languages eventually we'll have thrown all the improvements in CPU speeds we've had throughout the years...

    • @phoenix-tt 9 months ago +2

      Rust is simply amazing when coupled with Node/Web using Napi/WASM.
      Definitely worth learning

    • @pbdivyesh 9 months ago +1

      I'm at a weird junction in my career with 7 YOE with JS & TS. Can you please shed some light on whether I should seriously start learning GoLang, Zig or Rust?
      I mostly know working with threads and passing data through it and while learning GoLang it felt more naturally growing into me.
      But still would like to know what should I choose to make myself more impactful and payable!

    • @FlanPoirot 9 months ago

      @@pbdivyesh Zig has no industry yet, it's a niche and the compiler is not even 1.0 yet. There's still some rough edges like interfaces and its async runtime, there's also plan to refactor the standard library and change naming conventions
      Go has a lot of use in industry, a lot of server infrastructure tooling is in it now, the modern internet kinda "relies" on some tools made in it to run, it's a very traditional language tho, it's pretty minimal too, I like it but some people don't because it's not as strongly typed as rust.
      rust is very nice, but it has a bit of a steep learning curve, it's still finding its place even tho it's very popular amongst open source projects rn and big companies have partially written their code in it.
      Go is still much more used than Rust I think. You're more likely to get a Go than a Rust job too imo. But Rust, Go and Zig are kinda slightly different in ways that matter to make all of them worth learning (tho learning zig right now is not for everyone as it's still not 1.0, maybe learn C instead for the time being if u want something in the low level minimalist language niche)

    • @bdkamil95 9 months ago +4

      @@pbdivyesh payable? Out of the three you’ve posted? Go.

  • @jf3518 9 months ago +6

    17:00 That's exactly what I was wondering, when seeing line.split being used. Why are they not just using a simple buffer fread approach and reading it line by line instead. No promises, just a single hashmap and buffer and processed line, that's it.
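A rough sketch of the "single buffer, single hashmap" approach described here, assuming chunks arrive as already-decoded strings (from a file or network stream) and that the status code is the last column. `makeLineSplitter` and the sample data are made up for illustration; no promises or async iterators are involved.

```javascript
// Splits a sequence of chunks into lines, carrying any partial
// trailing line over to the next chunk.
function makeLineSplitter(onLine) {
  let carry = "";
  return {
    push(chunk) {
      const text = carry + chunk;
      let start = 0;
      let end;
      while ((end = text.indexOf("\n", start)) !== -1) {
        onLine(text.slice(start, end));
        start = end + 1;
      }
      carry = text.slice(start); // incomplete last line, if any
    },
    flush() {
      if (carry.length > 0) onLine(carry);
      carry = "";
    },
  };
}

const statusCounts = new Map();
const splitter = makeLineSplitter((line) => {
  // Assumption: the status code is the last tab-separated column.
  const status = line.slice(line.lastIndexOf("\t") + 1);
  statusCounts.set(status, (statusCounts.get(status) ?? 0) + 1);
});

// Chunks deliberately cut in the middle of a line.
splitter.push("/a\t200\n/b\t40");
splitter.push("4\n/c\t200");
splitter.flush();
console.log([...statusCounts]); // [["200", 2], ["404", 1]]
```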

  • @manofacertainrage856 9 months ago +2

    The question becomes: What would the JS times be if the data structures were altered to conform to the Rust solution? Did they need to add an extension to do this - and if so, would finding a new solution become a critical "must do now" task? How much time would that have saved by fixing a few lines of JS? Granted, writing the Rust extension was a win, but it appears the problem was found but not recognized during reimplementation - which is the flip side to the "now I understand why they did it that way" realizations during reimplementations.

  • @simonfarre4907 9 months ago +3

    I approve of this message. Data structures is ALWAYS where it's at.

  • @lacherolachero9409 9 months ago

    I love your take here! Good job

  • @nimmneun 9 months ago +7

    Appreciate the article bcs it gave me some ideas regarding plugins, but I feel like this was a case of not being overly proficient in js+node+profiling and using a document database for structured data you want to aggregate, which just doesn't make sense. 😅
    And on top of that, having a TSV/CSV etc is basically the best case, since you can go line by line. Imagine formatted XML or JSON.

  • @EVGizmo 9 months ago +2

    The sun emoji had me smiling the entire time

  • @voidwalker7774 9 months ago +32

    The Node vs Bun speed wars are ON !!!

    • @zerdox4 9 months ago

      you can also import rust into bun using some plugin AFAIK

  • @Pedriniist 9 months ago +5

    This is a classic Map Reduce problem. What about Apache Spark or big machine and DuckDB?

  • @bob80808 9 months ago

    I learn so much with these videos 😂 it's amazing

  • @ma77bc 9 months ago +1

    I would love a video on performance related to data structures.

  • @connorsheehan4598 9 months ago +8

    @8:18 why did prime say "don't use a promise" to eliminate idle time? is it cause it halts the program to wait for the response and instead it should just start running the queries in parallel?

    • @ThePrimeTimeagen 9 months ago +16

      callback. there is no extra event loop iteration required
      performance critical paths should avoid promises

    • @SimonBuchanNz 9 months ago

      @@ThePrimeTimeagen not exactly sure what you mean by event loop iteration: promise callbacks go on the microtask queue, so technically they all get cleared before the event loop continues.
      It's slower than immediately running if only because of the allocation and VM call stack, sure, but it shouldn't be polling for IO slow.

  • @RandomGeometryDashStuff 9 months ago +11

    09:15 doesn't async iterator create promise and {value:something,done:something} object on every iteration?

    • @ThePrimeTimeagen 9 months ago +3

      yes

    • @monad_tcp 9 months ago

      @@ThePrimeTimeagen that's stupid, even C# has a state machine object that's kept for the entire duration of the async and they just replace a field value with the object returned from async.
      Why people use NodeJS instead of C# or Java is beyond me....

    • @user-xw5tj4cb8x 4 months ago +1

      @@monad_tcp promises are state machines

  • @der_pudel 9 months ago +2

    > There might be certain issues that JavaScript simply can't solve efficiently
    Damn, I laughed so hard on this line 🤣

  • @KingDJRule 9 months ago +2

    "not even the simplest projects are that simple"
    totally feel that!
    reworking a simple application, that uses a node package to "screenshot" what user data a website stores
    you can enter some custom url's, it saves those url's and the output paths to the browsers local storage
    now it should run on a server with node and storing it to a postgres database instead of storing all the url's and output paths to the local storage
    I estimated 4 hours for this
    I only worked with mysql and didn't even know, there are little but meaningful syntax differences (like wtf, both are using SQL??)
    had to fight permission issues on the linux system (it was designed to work on windows)
    now it's round about 6 hours in and it's like 80-90% done

  • @timschannel247 8 months ago

    Nice contributions. I love your style of rhythm to information. However, my question: did anybody investigate using a Set instead of a Map? At least a Set forbids duplicates, which could even save you from problems. Thanks and best regards, Tim Susa.

  • @galen__ 9 months ago +1

    23:43 - Celebrities Explain DevOps has Flavor Flav explaining Kubernetes at around the 25sec mark

  • @mohamedaityoussef9965 9 months ago +6

    0:17 9/11 wasn't 3 days away from sep 6th

  • @v2ike6udik 8 months ago

    8:41 i come here to get a daily boost of laughter. moist balls, upper section.

  • @rosehogenson1398 9 months ago +6

    Definitely feels like they could have spent 30 minutes profiling the JS version lol

  • @jordanmowry9164 8 months ago

    😂 I always have a good time watching this guy.

  • @josevargas686 9 months ago

    Damn I had no idea about vmrss, that's so useful!

  • @xpusostomos 7 months ago +1

    He's reading 240gb, and splitting each line, and looking at the http response code. What he should do is read the line character by character, find the nth tab where the code is and decide if the code is >200, which presumably is a tiny fraction of the total. Then he would eliminate 99.9% of the memory garbage for valid responses, and only process the error lines with split, and it might just work in node.js.
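One way the "find the nth tab" idea could look in Node, as a sketch only: the `nthField` helper and the position of the status column (`STATUS_COLUMN`) are assumptions, since the real log layout is not given in this thread.

```javascript
// Returns the n-th (0-based) tab-separated field of `line`
// without allocating an array of all fields.
function nthField(line, n) {
  let start = 0;
  for (let i = 0; i < n; i++) {
    const tab = line.indexOf("\t", start);
    if (tab === -1) return undefined; // fewer than n + 1 fields
    start = tab + 1;
  }
  const end = line.indexOf("\t", start);
  return line.slice(start, end === -1 ? line.length : end);
}

const STATUS_COLUMN = 2; // hypothetical position of the status code
const line = "/home\tgoogle.com\t404\tMozilla/5.0";
const status = nthField(line, STATUS_COLUMN);
console.log(status); // 404
```

Only lines whose status turns out to be interesting would then need a full `split`.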

  • @matthewrideout2677 9 months ago

    would be interested in a cost benchmark comparison between bun / fastify / go

  • @tharinda 9 months ago +3

    Bruh that query 😂

  • @martijn3151 7 months ago +1

    When seeing this I’m a bit saddened that people would even think of all these crazy solutions like scaling it up to 25 servers using docker containers, while the solution is so simple: use the right tool for the job. I’ve the feeling that a lot of programmers out there are missing some core fundamentals and are just throwing more CPU and memory at a problem, when it doesn’t perform well. As opposed to truly understanding what’s happening.

  • @outstruck 9 months ago

    Tell me what should i learn for backend i know react js , rust do i need to learn node js? Or tell me any good way

  • @brod515 9 months ago

    what is that profiler tab he has in the browser. I don't see it on my browser.

  • @RoflMcCopter 1 month ago

    I have done this in JS and the memory issues are easy to solve: MySQL temp table. Our analytics are stored in MySQL instead of Mongo because I'm smart, so I can read through a CSV from S3 50MB at a time, parse the lines from it (yes I use .split("\n"), sue me), and store the results in a temp table.
    This way, if something goes wrong, the main data isn't polluted with partial data we need to query for and clear; we also can have the SQL server copy the data to the main table and drop the temp once everything is kosher.
    I kinda want to see if i can get a 200GB CSV to try to see how long it would take to parse, but I also would probably migrate to Rust at that stage lol

  • @coder_one 9 months ago +4

    If swapping one Node module from C++ to Rust gave such a kick, where did the developers of Deno (which is written in Rust) go wrong?

    • @apollolux 9 months ago +5

      It wasn't swapping a module from C++ to Rust, it was taking a nearly completely unoptimized Node-only loop and putting some functionality in a Rust-based module and rewriting the Node-based side a bit to compensate.

  • @qorzzz9252 8 months ago +2

    Seeing stuff like this really makes you wonder what else they don't understand... Imagine believing that the reason memory went down ~220x is due to switching to rust lol.

  • @keyboard_g 9 months ago +2

    Single node computer… This man is living in the year 2099. What’s next, statically linking your serverless functions into a single binary executable?

  • @nordern1 9 months ago +22

    Rule of thumb, the inherent speed and efficiency of a programming language is rarely enough to overcome a poorly written algorithm/data structure.
    Porting your bad JS code to rust will only create bad rust code.

    • @chudchadanstud 9 months ago +6

      Unfortunately it is enough. Your print inside a for loop will take you further in c than in python.

  • @shahidullahrahman7968 9 months ago +1

    AWS glue is a serverless solution that could’ve eliminated most of these infrastructure complexity and speed up the batch job. Given they already use s3 to store the data, they could’ve used an event driven pattern.

  • @juxuanu 9 months ago +3

    Honestly this seems like something relatively easily solvable with node and some lambdas consuming a queue way better than what they had. Just use some rxjs...

  • @climentea 9 months ago +1

    Polars + Python + VPS with enough RAM/CPU and done 😂 25 instances!!

  • @noherczeg 9 months ago

    26:40: esbuild does not replace TSC. If you want type checking not just compiling, then you need to run tsc explicitly.

  • @TFitz 9 months ago

    Birthdays are a great piece of information for those with ill intent. Happy birthday! I may be missing the point. I am a Sept baby too

  • @mfpears 9 months ago

    It's okay to use JavaScript for its convenience and ability to be isomorphic for resumability, and it's also okay to do any expensive algorithm in another language.

  • @DROWN. 9 months ago

    5:20 got me 💀💀💀

  • @philosophia5577 2 days ago

    Dart has Macros, Flutter rendering Engine renders 3D.

  • @nilfux 7 months ago

    Mongo aggregation is a gamechanger. It's stages, of course if you get the wrong order it screws you, it's linear. Derp.

  • @LordOfCake 8 months ago

    At 21:55 he types "vmrss" - is that a custom script? I haven't found any such tool online and it's not available by default on my Ubuntu system either.

  • @nathansodja 9 months ago

    I wanna be like you when I grow up Mr Primeagen

  • @juliocesartorrescama5661 9 months ago

    what command did u run? VmRSS share ;(

  • @steveAllen0112 9 months ago

    Seems like this would be a perfect use case for awk maybe?

  • @franklemanschik4862 9 months ago

    you got that wrong: the idle time comes from the nodejs event loop, or to be honest from libuv, which always waits for additional devices to respond. if you ran your code directly on v8 or on an engine other than node, e.g. es4x, which can use the so-called epoll (epoll is the kernel wait cycle), it would be much faster. hope that gives you some insights.

  • @fennecbesixdouze1794 9 months ago

    This whole thing could have been a one-liner in awk and it would have performed just as well.

  • @filiplintner4262 8 months ago

    I would like to understand how many sprints they spent on it and why simply not used emr or aws glue 😅

  • @josefaguilar2955 9 months ago

    8:41 is why I live here now.

  • @kamiljanowski7236 8 months ago

    Lol. I had a data funneling job in Paramount. TBs of data was downloaded, then processed and stored in a new file and then upload to a DB 😂 we would get new data files once a week. It became problematic when the process started taking 8 days 😂
    I started streaming it and suddenly it was taking hours, not days. Then I split it across multiple servers and suddenly the 8 day job was taking 15 minutes 😂

  • @NostraDavid2 9 months ago +2

    200GB / 25 machines / 3 hours = 740.74 KB of data per second, per machine...
    Holy smokes that's bad. simdjson claims to parse _gigabytes_ per second, as a frame of reference.
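The arithmetic above can be checked in a few lines, assuming decimal units (1 GB = 10^9 bytes, 1 KB = 10^3 bytes) and the 200 GB, 25 machines, 3 hours figures quoted in the comment:

```javascript
const totalBytes = 200e9;    // 200 GB, decimal units
const machines = 25;
const seconds = 3 * 60 * 60; // 3 hours

const bytesPerSecondPerMachine = totalBytes / machines / seconds;
const kbPerSecond = (bytesPerSecondPerMachine / 1e3).toFixed(2);
console.log(kbPerSecond + " KB/s"); // 740.74 KB/s
```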

  • @cassandrasinclair8722 7 months ago

    Maybe I am missing something, but doesn't JS have iterators?
    Why not have some iterator over the files and lines, extract the values, and reduce by counting?

  • @JakobKenda 9 months ago

    viola 😂 i can't believe you never heard of the word

    • @fenxis 9 months ago

      To be fair it probably should be spelt: voilà.

  • @skeleton_craftGaming 9 months ago

    2:24 getting padded down by the csv... .

  • @cjjavellana 9 months ago

    Is voila pronounced as "vyola" or "vwala"?

  • @1989DP3 9 months ago

    Considering the files were in a fixed format, could he not have used a precompiled regex to match at least the status code part and not split until necessary???

    • @vytah 9 months ago

      Probably the percentage of non-successful responses was too low to bother

    • @1989DP3 8 months ago

      could be, but the article never provided much info, like mentioned in the video, there was no actual profiling that could've given us some idea what exactly was causing the issue

  • @randomdamian 6 months ago

    10:36 Couldn't he create a variable with the object
    `let foo = { pathname: x, referrer: x }`
    and then after the push do
    `delete foo`
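On the `delete foo` idea: a small sketch of why it would not help, assuming the record is pushed into a long-lived array as in the article's loop (the `records` array and the field values are made up here). `delete` removes object properties, not `let` bindings, and in strict mode or ES modules `delete foo` is a syntax error. What keeps the object alive is the reference held by the array.

```javascript
const records = [];
let foo = { pathname: "/home", referrer: "google.com" };
records.push(foo);

// Dropping the local binding does not free the object: the array
// still holds a reference, so the garbage collector must keep it.
foo = null;

console.log(records.length);      // 1
console.log(records[0].pathname); // /home
```

The memory only becomes collectable once the array entry itself goes away, which is why aggregating instead of storing records is the usual fix.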

  • @lian1238 9 months ago +2

    I tried wix once. It was horrible. The platform had limited functionality. It was slow. The tools they provided were slow. Now I kinda get why.

  • @StuartLoria 9 months ago +11

    How I would love Prime, to explain why he does not like something, provide an alternative and explain why that is better, I feel I’m only getting half the story with his narrative style.
    It’s fun, but it is also patronizing and pretentious, he could change easily, but he seems happy with his current style, so I’ll keep searching for useful bits from his frat talks and research on my own, thanks for the pointers Agen.

    • @Adowrath 8 months ago +1

      I feel like, with his reactions coming from a livestream, it'd just get tiring if he had to re-explain his stance over and over again every time it came up, and that's the reason he doesn't do it.

  • @rakeshchowdhury202 9 months ago +21

    "I Ditched javascript and typescript for backend and desktop apps and my code runs as fast as C++ and Rust"
    No shit Sherlock

  • @adriankal 8 months ago +2

    It looks like they found different tutorials for js and rust and followed them literally. Then they measured and wrote this blog post. By chance rust implementation won.

  • @xpusostomos 7 months ago

    This is probably a case to break out your c compiler, I'm sure you could blow away these rust numbers. C code could use basically no memory, and run even faster still, and the processing is simple.

  • @odysseasragnarson7295 9 months ago

    what happened to the elixir video?

  • @TheOrionMusicNetwork 9 months ago

    Surely this would be trivial to do in Spark / Databricks. Feels like a lot of time was wasted....

  • @pdougall1 9 months ago

    Go and arenas would also be pretty sweet

  • @HyperionStudiosDE 9 months ago +2

    I'd still like to know if and how this would have been possible in JS. Just blaming JS before finding the exact cause seems a bit lazy to me.

    • @poolkrooni 9 months ago +1

      Very possible. Maybe not 25x optimizable but perhaps 15-20x without spending a huge amount of time on a complete rewrite

  • @jeremycoleman3282 9 months ago

    Uh, hey wix hire me

  • @JulianAndresGuarinReyes 9 months ago

    vooh - a - lah

  • @strixhooligan2347 7 months ago

    Anyone heard of Spark at Wix?

  • @ranadatascientist 8 months ago

    For processing this amount of data, Apache Spark would have been a good solution. PySpark with Python will be the easiest path.

  • @limbo3545 9 months ago +2

    Streaming data would have improved it significantly without the need of Rust. Unfortunately streaming in Node is kind of PIA

  • @SimonBuchanNz 9 months ago +1

    So i just tried writing a basic version of this with pretty good, streaming node code that avoided any allocation, and the straightforward Rust code.
    I created a 2Gb tsv, 100M rows, with some garbage first two columns and a third column being the index % 200.
    After a bunch of careful iteration and lots of debugging and optimization, including manually parsing the number to avoid allocating a string, I got a 70 line accept that ran in about 20s. Pretty good, that's 100MB/s!
    The Rust code ran in 7s the first time, and was only 24 lines.

    • @SimonBuchanNz 9 months ago +1

      Hey, good news! After a bunch of profiling and tuning the buffer size, I got the node version down to only 7.7 seconds, to Rust's 7.2 seconds!
      It required replacing every natural API with direct indexes and typed arrays, but you *can*, with a lot of effort, get JavaScript close in performance to incredibly basic Rust code!
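For a sense of what "direct indexes and typed arrays" can mean here, a sketch under assumptions: this is not the commenter's code, and `countThirdColumn` is a hypothetical helper. It assumes a newline-terminated buffer whose third column is a non-negative decimal integer smaller than `buckets` (matching the "index % 200" test file described above), and it handles a single buffer rather than chunk boundaries.

```javascript
const TAB = 0x09;
const NEWLINE = 0x0a;
const ZERO = 0x30;

// Counts the values of the third tab-separated column by walking the
// raw bytes. No per-line strings or arrays are created.
function countThirdColumn(buf, buckets) {
  const counts = new Uint32Array(buckets);
  let column = 0;
  let value = 0;
  for (let i = 0; i < buf.length; i++) {
    const byte = buf[i];
    if (byte === TAB) {
      column++;
    } else if (byte === NEWLINE) {
      counts[value]++;
      column = 0;
      value = 0;
    } else if (column === 2) {
      // Accumulate decimal digits without building a string.
      value = value * 10 + (byte - ZERO);
    }
  }
  return counts;
}

const sample = Buffer.from("x\ty\t7\nx\ty\t199\nx\ty\t7\n");
const counts = countThirdColumn(sample, 200);
console.log(counts[7], counts[199]); // 2 1
```

It trades readability for allocation-free parsing, which is the trade-off the comment is describing.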

  • @pbdivyesh 9 months ago

    I'm at a weird junction in my career with 7 YOE with JS & TS. Can you please shed some light on whether I should seriously start learning GoLang, Zig or Rust?
    I mostly know working with threads and passing data through it and while learning GoLang it felt more naturally growing into me.
    But still would like to know what should I choose to make myself more impactful and payable!

    • @IronJmo 9 months ago +1

      Zig and Rust are great and you should definitely consider picking up one or both at some point. That being said, there are not a lot of jobs currently for those languages.
      If you are looking to increase your compensation in the short term, consider a language that's already in high demand such as GoLang, or C#. Or break into a new specialty field that interests you. I think either of those will help you gain traction much quicker. I have seen front end devs pick up golang fairly quickly. I work in C# for my day job and there is plenty of work.
      If time isn't a concern, learn what you want. Rust is growing like crazy and Zig is starting to gain traction. There will be a point when there are more jobs for both.

    • @pbdivyesh 9 months ago

      @@IronJmo thank you for your input. I certainly feel friendly homely and at ease with GoLang as things come naturally to me.
      That being said RUST always sounds opinionated.
      I think I'll dive deep with Go and practice implementing all those microservices, messages queue, context cache sharing, threads concepts.
      Zig attracted me as it felt like C and Javascript with same kind of syntax

  • @ilearncode7365 9 months ago

    I would guess that most websites and SaaSes dont need to parse a 200GB file every day

  • @REDIDSoft 9 months ago

    It seems that our friend in the article could have done everything, without rust with 1% of the resources used if he knows devops and good computing practices LOL.

  • @Tony-dp1rl 7 months ago

    Yeah, this was mainly a bad algorithm in JS, this sort of processing would be completely IO limited, even in JS.

  • @filippoWUT 9 months ago +1

    x25 != +2500%
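Spelled out as arithmetic: a speed-up factor of 25 is an increase of 24 times the baseline, i.e. +2400%, while +2500% would correspond to a factor of 26. A tiny sketch (the `percentIncrease` helper is just for illustration):

```javascript
// Percentage increase implied by a speed-up factor.
function percentIncrease(factor) {
  return (factor - 1) * 100;
}

console.log(percentIncrease(25)); // 2400
console.log(percentIncrease(26)); // 2500
```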

  • @bas080 7 months ago

    Most SQL databases allow for importing CSV/TSV FAST!

  • @jamesclark2663 9 months ago

    But is there something than makes javascript 2500% less obnoxious in Node.js? That's really the reason I don't like using it.

  • @newsofthenerd 9 months ago +1

    no one mentioned this vmrss tool he is using. I cannot verify that it exists. So is it his own cli tool or a script, or does it actually exist on linux somewhere? I wrote a bash script that looks to do the same thing here.
    #!/bin/bash
    while [ -f "/proc/$1/status" ]; do
    current_kb=$(grep VmRSS /proc/$1/status | grep -Eo "[0-9]+")
    current_mb=$(bc
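The script above is cut off in the comment, so its remainder is left as is. The part it does show, reading the `VmRSS` line from `/proc/<pid>/status`, could be sketched in Node roughly like this. `parseVmRssKb` is a hypothetical helper and the sample text is made up; on Linux the real input would be the contents of that status file.

```javascript
// Extracts the resident set size in kB from the text of /proc/<pid>/status.
// Returns null when no VmRSS line is present.
function parseVmRssKb(statusText) {
  const match = /^VmRSS:\s+(\d+)\s+kB$/m.exec(statusText);
  return match ? Number(match[1]) : null;
}

const sample =
  "Name:\tnode\nVmPeak:\t  700000 kB\nVmRSS:\t   51200 kB\nThreads:\t11\n";
const kb = parseVmRssKb(sample);
console.log(kb, (kb / 1024).toFixed(1) + " MB"); // 51200 50.0 MB
```

Polling a live process would mean re-reading the file on an interval, for example with `fs.readFileSync` inside `setInterval`, and stopping once the file disappears.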

  • @ajadavis2000 9 months ago

    voila x2

  • @valerioversace9604 9 months ago

    The most preoccupying thing about all this is that the person writing the article probably has a CS degree.

  • @MosiurRahman-dl5ts 9 months ago +1

    Array vs Map
    Bro needs to go back to school

  • @YuriNiitsuma 8 months ago

    This guy made an Apache Spark with nodejs???

  • @sub-harmonik 9 months ago

    how is embedding rust more of a game-changer than embedding any other compiled manual-memory language like c?

  • @JorgetePanete 9 months ago

    Wow, I can't wait for Rust on the Server to replace Javascript on the Server!