New Go Billion Row Challenge w/ Great Optimizations | Prime Reacts

  • Published: Nov 5, 2024

Comments • 215

  • @ivanovcharov7534
    @ivanovcharov7534 7 months ago +297

    OMG ITS MY FAVOURITE PROFESSIONAL YAPPER!

    • @yaaaayeet745
      @yaaaayeet745 7 months ago

      5 DOLLARS A MONTH 🗣🗣🗣🗣🗣✋✋✋✋✋

    • @apexdude105
      @apexdude105 7 months ago +32

      "professional yapper" what a good job description for a streamer lmao

    • @charlesyoung601
      @charlesyoung601 7 months ago +4

      nl clears

    • @oat1000
      @oat1000 7 months ago

      nl my goat ​@@charlesyoung601

    • @jostasizzi818
      @jostasizzi818 7 months ago +1

      Why do I feel this is every so-called tech YouTuber right now

  • @neruneri
    @neruneri 7 months ago +109

    Asking Flip to take something out seems like the most reliable way to ensure that it absolutely does not get taken out.

  • @R4ngeR4pidz
    @R4ngeR4pidz 7 months ago +185

    Narrator:
    Flip did, in fact, not take that out (16:00)

    • @teejaded
      @teejaded 7 months ago +6

      Flip. Take this anti-flip propaganda out.

    • @flipmediaprod
      @flipmediaprod 7 months ago +13

      I stand against the establishment

    • @Kannatron
      @Kannatron 7 months ago +4

      @@flipmediaprod truly an upstanding and forward-thinking editor. You kept it in for the people, 👏🤯🤯🤯

    • @GermanClaus
      @GermanClaus 7 months ago

      He sounds like he is begging :D

  • @MHarris021
    @MHarris021 7 months ago +10

    Tip for remembering stalagmites and stalactites. "Stalagmites have a g for ground and stalactites have a c for ceiling", it's how I remember which is which. It was a tip in a Xanth novel by Piers Anthony. I think it was "Man from Mundania", but I'm not sure because I haven't read them in 20+ years. Gosh, that makes me feel old. :)

    • @retropaganda8442
      @retropaganda8442 7 months ago +1

      Ahaha, the true mnemonic is actually just the etymology of the word. I don't know if it's Latin or Greek, but for example, in french it's m for monte (raise) and t for tombe (fall). Simple.

    • @collinstasiak4994
      @collinstasiak4994 7 months ago

      Stalagmite sounds like dynamite, and you don't want to put that on the ceiling, is how I've always remembered it

    • @Eutropios
      @Eutropios 7 months ago

      Stalactites stick tight to the ceiling. Stalagmites might grow upwards

  • @Sw3d15h_F1s4
    @Sw3d15h_F1s4 7 months ago +56

    the JDSL implementation would be 10x faster. Tom's a genius!

    • @jerichaux9219
      @jerichaux9219 7 months ago +4

      JDSL would have melted the CPU from how fast it would be parsing those rows.

  • @Thorarin
    @Thorarin 5 months ago +8

    FYI: Buffer size of 1024 is terrible, because most modern disks use 4kB sectors nowadays. So some multiple of 4kB is immediately better.

    • @LtdJorge
      @LtdJorge 3 months ago +2

      True, but I don’t really know why everyone is doing buffered reads. The challenge says that the file is put in a RAM backed fs (some tmpfs) before running the program. Best way is to just mmap it which is zero-copy and zero-alloc.

    • @Musikvidedo
      @Musikvidedo 3 months ago

      @@LtdJorge yep. The file reading and parsing can be improved quite a lot. Even the original Java guys did a lot more

  • @strangnet
    @strangnet 7 months ago +31

    Wow: a 4.7HGz with 6000mhz memory. Those millihertz come in handy with the HenryGigaz processor...

  • @sanderbos4243
    @sanderbos4243 7 months ago +7

    It drives me bonkers how they used 10 instead of '\n', and even went so far as to describe the magic integers at 28:21 with comments like "if b == 45 { // 45 == '-' signal"

  • @hierax49
    @hierax49 7 months ago +76

    the author has a brazilian name. brazil mentioned

    • @rawallon
      @rawallon 7 months ago +5

      Dev at fallenzão's Gamers Club (2:06)

    • @Thoer
      @Thoer 7 months ago

      let's go!!!

    • @user-zg2bx4oz2p
      @user-zg2bx4oz2p 7 months ago

      It is also a Portuguese name

    • @microcolonel
      @microcolonel 7 months ago

      Nobody lives in Portugal 😂​@@user-zg2bx4oz2p

  • @jackevansevo
    @jackevansevo 7 months ago +3

    I love these posts, there's a lot of tidbits of information to learn.

  • @michealkinney6205
    @michealkinney6205 7 months ago +6

    "Managers be like push it to prod! We're done... Good enough!" @ 20:16. Lol, like every non-technical manager ever.

  • @andyvisser
    @andyvisser 7 months ago +4

    My guess on the read buffer and diminishing returns: I bet you get max performance when the buffer size aligns with the underlying hardware's size. Like it's best when you read a sector at a time (or however SSDs are addressed/broken down in firmware).

    • @TehKarmalizer
      @TehKarmalizer 7 months ago +1

      Or file system block size. Typically reading in multiples of the block size is most efficient.

  • @i_sometimes_leave_comments
    @i_sometimes_leave_comments 7 months ago +2

    4:35 Assuming go's `map` is a self-growing (via reallocation) array (like C++ `vector` or C# `List`), as the `map` grows, you'd have to mem copy the whole underlying array, and a bunch of pointers would be way cheaper than a `struct`

    • @anon1963
      @anon1963 6 months ago

      you can do vec.reserve(n) in c++. eliminates need for expensive reallocation
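
For context, Go's built-in `map` is a hash table rather than a growable array, but storing pointers still keeps each entry small and lets you update a value in place without writing the struct back. A rough sketch, where `stats` and `add` are hypothetical names and temperatures are assumed to be stored as integer tenths of a degree:

```go
package main

import "fmt"

// stats accumulates readings for one station, in tenths of a degree.
type stats struct {
	min, max, sum, count int64
}

// add folds one reading into the per-station aggregate.
// With pointer values the update happens in place; with map[string]stats
// every update would need a read, a modify and a write back into the map.
func add(m map[string]*stats, station string, tenths int64) {
	s, ok := m[station]
	if !ok {
		m[station] = &stats{min: tenths, max: tenths, sum: tenths, count: 1}
		return
	}
	if tenths < s.min {
		s.min = tenths
	}
	if tenths > s.max {
		s.max = tenths
	}
	s.sum += tenths
	s.count++
}

func main() {
	// A size hint plays a role similar to vec.reserve(n) in C++.
	m := make(map[string]*stats, 1024)
	add(m, "Porto", 123)
	add(m, "Porto", -45)
	s := m["Porto"]
	fmt.Printf("min=%d max=%d mean=%.1f\n", s.min, s.max, float64(s.sum)/float64(s.count))
}
```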

  • @rapzid3536
    @rapzid3536 7 months ago +4

    mmap
    Split the memory space into the number of Cores
    Hand out pointers start/end to threads
    Walk all but the first pointer start forward until after the next new line or EOF.
    Start ripping from there.
    Profit.
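
The boundary-walking step of this recipe can be sketched in Go as follows. This is only an illustration: it works on an in-memory `[]byte` (which an mmap call would supply in a real program), and `chunk` and `splitOnNewlines` are hypothetical names:

```go
package main

import (
	"bytes"
	"fmt"
)

// chunk is a half-open byte range [start, end) into the data.
type chunk struct{ start, end int }

// splitOnNewlines cuts data into at most n chunks. Each tentative cut point
// is walked forward to just past the next '\n', so no row straddles a chunk.
func splitOnNewlines(data []byte, n int) []chunk {
	if n < 1 {
		n = 1
	}
	chunks := make([]chunk, 0, n)
	size := len(data) / n
	start := 0
	for i := 1; i <= n && start < len(data); i++ {
		end := len(data) // the last chunk always runs to EOF
		if i < n {
			end = i * size
			if end < start {
				end = start
			}
			if nl := bytes.IndexByte(data[end:], '\n'); nl >= 0 {
				end += nl + 1
			} else {
				end = len(data)
			}
		}
		chunks = append(chunks, chunk{start, end})
		start = end
	}
	return chunks
}

func main() {
	data := []byte("a;1.0\nbb;2.0\nccc;3.0\ndddd;4.0\n")
	for _, c := range splitOnNewlines(data, 3) {
		fmt.Printf("%q\n", data[c.start:c.end])
	}
}
```

Each chunk can then be handed to its own goroutine, with the per-chunk results merged at the end.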

  • @Olodus
    @Olodus 7 months ago +7

    Dammit, now I feel like I will have to do this in Zig or something... But great article. Really shows the experimentation and learning process.

  • @MrDadidou
    @MrDadidou 7 months ago +20

    French gang:
    Stalag-mite (M like "monter" in French, to go UP)
    Stalac-tite (T like "tomber", to fall)

    • @_kostant
      @_kostant 7 months ago +2

      Always remembered it from the C in stalactite being “ceiling” lol.

    • @OnStageLighting
      @OnStageLighting 7 months ago +2

      'might go up, tights come down.'

    • @microcolonel
      @microcolonel 7 months ago +2

      @@OnStageLighting giggity

    • @itsthesteve
      @itsthesteve 7 months ago +2

      Stalag (ground), Stalac (ceiling)

  • @metropolis10
    @metropolis10 7 months ago +13

    Primeagen's reactions in this video: "wow that's a lot slower than I would have thought... well I GUESS it is a BILLION items" x1 Billion

  • @michaelgreenberg6344
    @michaelgreenberg6344 7 months ago +8

    On his hardware, he's I/O bound and any optimization is useless.
    Dude has 32 gigs of RAM. Meaning that, on an idle enough system, most of that memory will be used for file system cache, into which a file with the size of 13GB fits quite neatly.
    I will probably not be too exaggerating if I say that he only read the file from disk once - the first time he ran his program. If not once, then by the fifth run, the entire file would be up in RAM for sure. All the rest of the "I/O" tests were performed against the memory, which just checked how fast memory copy in chunks of different sizes and multiples of allocations can be performed. Had he been performing actual I/O, there's no way he'd be getting >13GB/s (which a time of ~0.98s suggests.)
    In fact, his drive is rated at 497MB/s (manufacturer spec), so on that hardware, it's useless to play with the buffer size, since you won't be reading the file faster than ~27 seconds, as the first file read test with the buffer size of 1024 would suggest. 13*1024/497=26.78, and i'm pretty sure that all the allocations were done during iowait, so it's safe to assume the file size is not exactly 13GB, but more around 13.3-13.5 :D
    This article is written by someone who probably doesn't understand storage or operating systems too well (using windows for development - first hint... jk,) but it's a nice experiment to see how well you can optimize such an algorithm if your disk bandwidth is infinite.

  • @hinzster
    @hinzster 7 months ago +13

    Oh damn, for-loops are now considered boomer loops? What about while(true)/break loops? Are those dinosaur loops?

    • @hinzster
      @hinzster 7 months ago +3

      Also, back when I was doing that obscure shift organizer program for hospitals, I used my own fixed-point package to optimize stuff - everything was one single digit of precision anyway, so I just worked with ints representing 10ths of hours (another "problem"). Worked well, fast, and didn't use as much space as those pesky floats. I did this before the FP coprocessor was included in Intel processors (i.e. before the 486). My actual development machine was an original IBM PC XT, running an 8088 at 4.77MHz! I needed all the speed I could get.

    • @weakspirit_
      @weakspirit_ 7 months ago +1

      nah, the dinosaur loops are the asm branch loops 🦖

    • @FakeDumbDummy
      @FakeDumbDummy 7 months ago

      Well, Go doesn't have while loops, so yes, dinosaur loop for me

    • @SandraWantsCoke
      @SandraWantsCoke 7 months ago

      Those are biblical times loops
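
The fixed-point trick mentioned earlier in this thread (integer tenths instead of floats) maps neatly onto the challenge's temperature format. A small Go sketch, assuming every reading has exactly one fractional digit and is well formed; `parseTenths` is a hypothetical helper:

```go
package main

import "fmt"

// parseTenths converts a reading such as "-12.3" or "7.0" into integer
// tenths (-123, 70). It assumes valid input with one fractional digit.
func parseTenths(b []byte) int {
	neg := false
	n := 0
	for _, c := range b {
		switch {
		case c == '-':
			neg = true
		case c >= '0' && c <= '9':
			n = n*10 + int(c-'0')
		}
		// The '.' is simply skipped: the digits already form the tenths value.
	}
	if neg {
		return -n
	}
	return n
}

func main() {
	fmt.Println(parseTenths([]byte("-12.3"))) // prints -123
	fmt.Println(parseTenths([]byte("7.0")))   // prints 70
}
```

Min, max and sum can then stay in integers, with a single division by 10 when the result is printed.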

  • @kodekata
    @kodekata 6 months ago

    A Goroutine is Go's syntax for Tony Hoare's Communicating Sequential Processes (CSP, not like the browser's CSP though). Fun fact: the creator of Go had made several previous languages, all with CSP baked in. In Clojure[script], the simple syntax for CSP was enabled via a library.
    CSP has been implemented in JS via generators, but there are implementations with more usage (eg. for Clojure).

  • @rogerdinhelm4671
    @rogerdinhelm4671 7 months ago +1

    Current top Java implementation reaches 300ms, but measurements are done on reference hardware (32 cores / 64 threads), and thus might be different from wherever the Go guy was running it.

  • @SimonBuchanNz
    @SimonBuchanNz 7 months ago +4

    I did some basic aggregation with Node on a 2 GB ini file: from memory, with a bunch of work I got it down from 40s done somewhat naturally to about 7s done by a crazy person. The dumb 10-line Rust code took 3s or something.

  • @PhilipAlexanderHassialis
    @PhilipAlexanderHassialis 7 months ago

    I like how it's from 95s to 1.96s whilst inside the article a sub-second result is mentioned.

  • @danielmccann2979
    @danielmccann2979 7 months ago +8

    For one second I read that as millihertz of RAM and was like, why is your RAM going only 6 Hz, are you manually clocking that thing

  • @JackDespero
    @JackDespero 7 months ago +7

    I am sorry, but you are wrong.
    Boomer loops are GOTO and CONTINUE loops.
    The simulation code that we use at work was written in modern FORTRAN (FORTRAN 77, not 65) and is full of
    GOTO 1000
    Do stuff
    1000 CONTINUE

  • @jhk940
    @jhk940 7 months ago +4

    I must have missed something. The SSD (Kingston SSD SV300S37A/120G) has a maximum read rate of 450MB/s, so reading the 13GB should take 28.88 seconds minimum. wat. Can someone explain?

    • @jhk940
      @jhk940 7 months ago

      Well, I guess the complete 13GB file is cached in RAM by Windows.

    • @TurtleKwitty
      @TurtleKwitty 7 months ago

      @@jhk940 Yup, every OS keeps hot files in RAM; the Java one actually had a final implementation with a ramdisk instead so the SSD overhead didn't matter

  • @kennethhughmusic
    @kennethhughmusic 1 month ago

    Stalactite - remembered by "it has to hold on tight" :)

  • @kendlyduprince
    @kendlyduprince 3 months ago +2

    In Java it is 1.3 sec

  • @retropaganda8442
    @retropaganda8442 7 months ago +1

    I just clicked on the first search engine result for the one billion rows challenge in c language and the result of the guy beats the "official" java winner.
    Not surprised.

    • @morosis82
      @morosis82 5 months ago

      Not that surprising, the first result is likely to be the best linked (highest ranked) when everyone is talking about fastest implementation in language X.

  • @retropaganda8442
    @retropaganda8442 7 months ago +1

    The word "buffer" CRIES for underoptimised implementation with data being copied between kernel memory and user space process memory.
    I think i'd start by doing an mmap of the whole data on disc, assuming it's already in the fs cache.

  • @StrengthOfADragon13
    @StrengthOfADragon13 7 months ago

    Can't wait for the "what is your 1 billion row challenge time" question in interviews. (Actually though, taking a legit stab at the challenge for myself sounds super fun and I really wanna see if work will greenlight letting me work on it as part of my training hours)

  • @evergreen-
    @evergreen- 7 months ago +5

    This video gives me huge flashbacks

  • @mikejohnstonbob935
    @mikejohnstonbob935 7 months ago

    Devin's out there taking notes. This whole article is honestly like an AI overtraining on a specific dataset. Its language capabilities even degrade as it reaches its max context window

  • @RenThraysk
    @RenThraysk 7 months ago +4

    Unfortunately it produces corrupt data. If you run it multiple times over the same 13GB dataset, it'll produce a different result each time. Some temperature values end up in the tens of thousands, and new locations also appear. Signs of race/memory corruption issues.

    • @anon1963
      @anon1963 6 months ago

      What? Your program or the program in the video?

    • @RenThraysk
      @RenThraysk 6 months ago +1

      @@anon1963 The solution in the video.

    • @anon1963
      @anon1963 6 months ago +1

      @@RenThraysk ah ye, they probably ran finished program once and were like: "good enough!"

  • @SimonBuchanNz
    @SimonBuchanNz 7 months ago +14

    "mutex is a spin lock" technically mutex is just the semantics, not an implementation, and there's a few ways to do it, with different trade-offs.
    They generally *start* with a spin lock, but that's just an optimization assuming the lock time is short. They then need a way to put the acquiring thread to sleep, and there's a bunch of ways to implement that. You can do it in user space with just thread sleep and wake functions, which can be good for "fair" locks, but you can also use events or explicit kernel mutexes, which might be better for thread residency.

    • @Kane0123
      @Kane0123 7 months ago +4

      I’m going to give you a like based purely on the amount of text. I’m happy for you though, or sorry that happened.

    • @rawallon
      @rawallon 7 months ago

      technically, anything is just the semantics

  • @MikePaixao
    @MikePaixao 7 months ago +1

    I remember having to parse 600TB databases in the gamedev industry, I ended up using Python and the Windows copy buffer to just snapshot the file into memory

    • @dv_xl
      @dv_xl 7 months ago

      Interesting, I have a few questions.
      Obviously it can't load 600TB into memory at once, did you chunk your reads or were the underlying DB files split up naturally?
      Were you using a network file system?
      Did you run multiple processes and map/reduce or just a single process? I'm curious how long it took in either case

    • @MikePaixao
      @MikePaixao 7 months ago

      @dv_xl the first layer was using perforce, so any previous work or code could compare against cached version of all unchanged files locally synced
      Next you need to break up Parallel loops based on file types, ascii files are super easy to write regex logic (think file mirroring) I would quickly build a list of all file dependencies (if I was parsing a game map, I listed all the models, if it was a 3d model, it connected what maps and textures used it etc etc...
      Now for the copy trick, depending on file size, when having to parse through larger 1gb+ files you can choose to either copy an entire folder or individual files, and binary format you need to do the painful thing of writing a custom binary parser for the now copied into memory data
      I remember back on wolfenstein a couple of times having to checkout the entire repo because German lawyers were like "nein! You cannot have any file names with verboten naming on disk" and when you need to edit file names across an entire project that is weeks away from gold master.. not a lot of wiggle room :P

    • @MikePaixao
      @MikePaixao 7 months ago

      @@dv_xl So the data was all stored in Perforce, so I would store a snapshot with a Perforce timestamp, so I could choose a cached or fresh mapping
      depending on folder/file size, sometimes you could copy entire folders to parse through larger files... it really depended on file types or single files at a time with custom binary interpreter. so you could skip entire sections of files and pull out relevant info (I was tracking all assets, where they showed up in engine or in a map and then all the related textures, models, audio etc..) It was a reflection system across data formats :P
      All done in parallel, and a weird reason to do batches of folders and not file by file is the limited number of threads Python would spin up before hitting some per-machine arbitrary number of threads Windows can keep track of :P (also, early exits everywhere, I don't need to parse a 3D model's vertices, or the animation sequence in a skeleton!)
      At some point I was checking out the entire project because german lawyers were like "Nein! Verboten! you cannot have nazi named file folders on the shipped disc"
      "but it's wolfenstein?" -> glad I added the "find and replace" option so I could do mass edits while it was parsing through :D
      timing I had it under around a few seconds, under 1s if the perforce cache existed (db was stored as sql file with no read/write locks in perforce)

  • @TurtleKwitty
    @TurtleKwitty 7 months ago

    The mighty stalags rise, while the other stalags hold tight is my way of remembering which is which hahah

  • @burkskurk82
    @burkskurk82 7 months ago

    Prime, what about Redis changing licensing model and Garnet (by Microsoft) written in C# outperforming Redis in C++. Help us make sense of it.

  • @ytdlgandalf
    @ytdlgandalf 7 months ago +3

    These times are too good to be true. Heavy caching through pagecache. He should flush the pagecache before every try. 13GB in 1.96s =~ 6.5GB per second. No way in hell with the mentioned SSD. Flushing the cache for honest numbers on the same system is benchmarking 101. Did he ever run the Java implementation on his own system to set a baseline, or did he just take the other benchmarker's results? Do people even know how to benchmark?

    • @arden6725
      @arden6725 7 months ago

      why would you want a software optimization benchmark to be limited by your disk speed, that’s literally pointless

    • @ytdlgandalf
      @ytdlgandalf 7 months ago +2

      @@arden6725 why? For reproducibility. His results could now easily be skewed from run to run if, for example, Chrome is having a bad day and is filling his memory and thereby flushing his pagecache during some runs but not others. If you are unaware of this you draw wrong conclusions about which changes made your program faster or not. If you want to take it out of the equation, then the benchmark should've stated to use a ramdisk or generate the data in-process

    • @javierflores09
      @javierflores09 7 months ago

      @@ytdlgandalf this kind of code isn't meant to be run on a workstation but on a server, meaning it'd be able to take full advantage of the machine. When it comes to a workstation, all of these low-level implementations will fall behind the general implementation because there's no way to predict the amount of resources the environment is willing to give the program in question in order to complete in the fastest time possible.

    • @ytdlgandalf
      @ytdlgandalf 7 months ago

      @javierflores09 this is about reproducibility. Doesn't matter if its your workstation or a "server".

  • @CipovPeter
    @CipovPeter 7 months ago

    I am wondering why you need a mutex when reading from the file. Why not open the file x times for reading, and use seek to start reading from the right position? The right positions can be computed in the main thread at the beginning, as a sort of index. I did not test it, but I suppose it would remove a lot of the merge logic from the end of the article

    • @Jesse_Carl
      @Jesse_Carl 7 months ago

      I was also wondering this

  • @fuzzy-02
    @fuzzy-02 7 months ago

    Renato Pereira alone sounds like a cool secret agent driving a very fast classical car

  • @michelvandermeiren8661
    @michelvandermeiren8661 7 months ago +6

    Java has proven to be the fastest lang on earth with this challenge ! No other lang can compete

    • @dv_xl
      @dv_xl 7 months ago

      Firstly this statement is inherently false, it can never be as fast as the fastest asm or c. But more importantly, where did you get that idea? I looked up the results for Java from the test and they were 6 seconds. It's not clear what the hardware used for the testing was, but it doesn't look to me like there's a good cross language comparison table anywhere

    • @michelvandermeiren8661
      @michelvandermeiren8661 7 months ago

      @@dv_xl fastest java took 1.4 sec

  • @sifi_crafter961
    @sifi_crafter961 4 months ago

    Easy way to know if it's a stalagmite or a stalactite: the M is pointing upwards

  • @Wielorybkek
    @Wielorybkek 7 months ago +1

    I don't get it, the File Read Buffer took only 0.98 s!!!! Why everyone is ignoring it!!!

  • @Alguem387
    @Alguem387 7 months ago +2

    MMAP?

  • @ReedoTV
    @ReedoTV 6 months ago

    They should have used their "4.7HGz" PC to run a spell checker

  • @yante7
    @yante7 7 months ago +1

    16:00 flip did NOT take that out

  • @absurd0000
    @absurd0000 7 months ago +2

    Flip, more like Slip, cuz he be slipppppin

  • @issacwessing4945
    @issacwessing4945 7 months ago +1

    I'm having some problems solving this in HTML

  • @thatmg
    @thatmg 7 months ago +1

    PORTO MENTIONED!

  • @parikshitpatil1421
    @parikshitpatil1421 7 months ago +3

    I guess best java solution used mmap.

  • @Sw3d15h_F1s4
    @Sw3d15h_F1s4 7 months ago +2

    someone should do the 1 billion row challenge using vim

  • @dand4485
    @dand4485 7 months ago

    I'm thinking one way to convert the temp (float) is have a hash map for all 100 possible different values i.e. map("99.9") simply return 99.9....

    • @imaymakesomevids
      @imaymakesomevids 7 months ago

      There are 2000 values, cos of the decimals.
      The hash and lookup would be a lot slower than just parsing the numbers directly.

    • @retropaganda8442
      @retropaganda8442 7 months ago +1

      Don't hash it! Just make a 2000 element array, use the raw bits as an index, and it's gonna be fast.
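
A hedged Go sketch of the array-lookup idea from this thread. It assumes readings lie between -99.9 and 99.9 with exactly one fractional digit, which gives 1999 distinct values (close to the 2000 mentioned above). `index` and `buildTable` are hypothetical names:

```go
package main

import "fmt"

// tableSize covers -99.9 .. 99.9 in steps of 0.1.
const tableSize = 1999

// index maps a reading like "-3.4" or "12.0" to a slot in 0..1998 by
// reading the digits positionally. It assumes well-formed input.
func index(b []byte) int {
	neg := b[0] == '-'
	if neg {
		b = b[1:]
	}
	var v int
	if len(b) == 4 { // "dd.d"
		v = int(b[0]-'0')*100 + int(b[1]-'0')*10 + int(b[3]-'0')
	} else { // "d.d"
		v = int(b[0]-'0')*10 + int(b[2]-'0')
	}
	if neg {
		v = -v
	}
	return v + 999
}

// buildTable precomputes the float64 value for every slot.
func buildTable() [tableSize]float64 {
	var table [tableSize]float64
	for i := range table {
		table[i] = float64(i-999) / 10
	}
	return table
}

func main() {
	table := buildTable()
	fmt.Println(table[index([]byte("-3.4"))])
	fmt.Println(table[index([]byte("12.0"))])
}
```

Computing the index already amounts to parsing the number, so whether the table buys anything over keeping the integer tenths directly would need measuring.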

  • @soggy_dev
    @soggy_dev 7 months ago

    I actually prefer specific syntax for multiple return parameters 🤷‍♂️ The language is almost certainly creating an anonymous struct under the hood anyway, so I'd rather it be more obvious they're connected/contiguous. Plus you have the option of passing around the entire tuple or destructuring into the components depending on what's the most convenient which just seems objectively better to me. I love go but that's up there with lack of sum types on the list of things that bother me

    • @aurele2989
      @aurele2989 6 months ago +1

      we do a little struct { int a, b, c; } fn(int in) { /* ... */ return (typeof(fn(0))){ a, b, c }; }

  • @thekwoka4707
    @thekwoka4707 7 months ago

    Probably could do pretty fast with Bun. Bun.file has some good ability to read file partials, so you could see how big the file is, spawn a ton of threads and handle only the parts for each....
    JavaScript does also have cool things like SharedArrayBuffers that could enable some more low level style memory control...

    • @marcomassa84
      @marcomassa84 7 months ago

      I got the 1BRC down to 5.5 sec with Node.js. Bun has a bug with the highWaterMark option that makes it less performant than Node (at least in my test)

    • @anon1963
      @anon1963 6 months ago

      remember about Amdahl's law

  • @MichaelSalaverry
    @MichaelSalaverry 7 months ago +11

    One billion comments, lets go!

  • @pylotlight
    @pylotlight 7 months ago +3

    does flip even watch the videos or just use the markers seeing he misses every cut request ;p

  • @Tony-dp1rl
    @Tony-dp1rl 7 months ago +1

    I still don't understand how these BILLION row challenges are not entirely IO limited ... I mean even in JS, how do you spend more CPU time than it takes to read that much data? :/

    • @Tresla
      @Tresla 7 months ago

      This is my question. How are they getting millisecond solutions? What are they running on? My NVMe drive tops out at around 1500MBps, so I couldn't even process the file in less than 10 seconds...

    • @Musikvidedo
      @Musikvidedo 3 months ago

      @@Tresla gets cached in RAM

  • @terribleprogrammer
    @terribleprogrammer 7 months ago +1

    Java did it very fast with Graalvm native compilation. Not with JVM. Graalvm is very interesting.

    • @dragoncommands
      @dragoncommands 7 months ago +1

      Graal doesnt differ that much. One of the main principles behind Graal is AOT which is able to practically knock the start-up period of java programs out of the water.
      However what graal gains in performance increases it sacrifices in things like meta-programming. Something like reflection is blocked out and becomes impossible at runtime.
      That said graal is an impressive piece of software written in java that only gets better with time.

  • @Kane0123
    @Kane0123 7 months ago

    I’m waiting for a cloud vendor to suggest just running all billion in serverless - scale up to what you need to scale down when you’re done bro, e.z.

  • @BoominGame
    @BoominGame 7 months ago

    2 business days: from Friday to Monday.

  • @thekwoka4707
    @thekwoka4707 7 months ago +1

    forEach is faster than boomer loops in newer versions of node and in bun.
    Pretty wacky, but true.

    • @ThePrimeTimeagen
      @ThePrimeTimeagen 7 months ago +1

      Actually not true
      This test was done in 20.x, 18.x, and 16.x
      By the very definition they cannot be faster. They can be of equal speed if extremely clever compiler stuff happens.
      This would require jit to take place as well

    • @lucsoft
      @lucsoft 7 months ago

      ​@@ThePrimeTimeagen Mmmh i tested NodeJS 21 and actually found it was faster:
      const array = Array.from({ length: 1_000_000 }).fill(1);
      time = performance.now(); array.forEach((e) => e); console.log(performance.now() - time);
      // run was between 10 - 14ms
      compared with
      time = performance.now(); for (e of array) { e; }; console.log(performance.now() - time);
      // run was between 14 - 20ms
      Wonder why its faster

  • @bluecup25
    @bluecup25 7 months ago

    Prime, do it. Just do it.

  • @JackClawson
    @JackClawson 7 months ago

    Boomer loops sounds like a great cereal, now with fiber.

  • @sebastianwapniarski2077
    @sebastianwapniarski2077 7 months ago +1

    Can anyone suggest a streamer that is as good with SWE but on the other side of the spectrum - TEMPERAMENTwise. I'm more of an Uncle Bob kind of guy.

  • @KaydotOrigin
    @KaydotOrigin 7 months ago

    Would be awesome to see you do it in ts/js

  • @hosseines276
    @hosseines276 7 months ago

    whoa! really enjoyed!

  • @rezyadlf
    @rezyadlf 7 months ago

    2 business days got me)))

  • @Tony-dp1rl
    @Tony-dp1rl 7 months ago

    forEach, map, etc. are the devil in JS

  • @olhoTron
    @olhoTron 7 months ago

    Before even watching the video I'll guess the biggest gains will come from reducing allocations

  • @sedrakpc
    @sedrakpc 7 months ago

    How is it done in Java in 1.5 seconds? Now you have to read the Java version :)

    • @lazyh0rse
      @lazyh0rse 7 months ago +3

      they used native GraalVM, it compiles java to machine code

    • @javierflores09
      @javierflores09 7 months ago

      @@lazyh0rse this wasn't the only reason. Sure, it reduced the time by removing the startup cost, however there are many tricks that led to the 1.5 seconds (and even 323ms when using all 32 cores of the test machine instead of just 8). There is a great blog post by QuestDB that explains the tricks used in the top solutions in detail.

  • @GermanClaus
    @GermanClaus 7 months ago

    KOTLIN mentioned!!!

  • @qazarify
    @qazarify 7 months ago

    This cannot be true, the Kingston SSD SV300S37A is not capable of transferring 13 GB/s

    • @Yawhatnever
      @Yawhatnever 7 months ago

      Windows caches file reads in RAM when it can, so it's plausible that not all of the reads are hitting the disk

  • @maimee1
    @maimee1 2 months ago

    5:50 Why didn't they start with profiling?

  • @caedenw
    @caedenw 7 months ago +1

    I can’t believe I have to point this out but his SSD can’t do 13GBps and so this is all coming from his page cache in RAM. Don’t expect anything close to these results if you flush the cache. In light of that, he should be seeing a much better score if implemented correctly since he has so many threads.

  • @MrWalrus3451
    @MrWalrus3451 7 months ago

    Flip ain't taking it out brother.

  • @ismbks
    @ismbks 7 months ago +1

    the one guy in your chat spamming "hardly know her" jokes

  • @xichenliu-w3r
    @xichenliu-w3r 7 months ago

    13GB in one second? I think the ssd couldn't even be that fast, right?

  • @b0nes95
    @b0nes95 7 months ago

    how can you read 13GB from disk in 1.5 seconds even :/ I need to watch the rest of the video lol, the timer must've been started while the 13GB was in mem

    • @Tresla
      @Tresla 7 months ago

      RAM disk possibly?

  • @avalagum7957
    @avalagum7957 7 months ago

    That Go person used tabs (8 spaces)?

    • @Yawhatnever
      @Yawhatnever 7 months ago

      All Go code uses tabs. The reason it looked excessive was because the default browser styling for the tab-size property is 8 spaces, and apparently they didn't change it with css.

    • @avalagum7957
      @avalagum7957 7 months ago

      @@Yawhatnever Oh, thank you. I didn't know that.

  • @dirty-kebab
    @dirty-kebab 7 months ago

    I just want to say his RAM won't run at 6,000MHz. I found out the hard way, getting 128GB, and it down-clocks the rate, because AMD chips can't handle faster memory like Intel chips.
    Overall I chose AMD, but clearly there's more nuance than they all advertise 🤯

  • @rasalas91
    @rasalas91 7 months ago

    flip did not take that out

  • @birdbrid9391
    @birdbrid9391 7 months ago

    flip did not cut it out

  • @weakspirit_
    @weakspirit_ 7 months ago +1

    i'm calling it, multithread/multiprocess overhead is going to show that his single process/thread solution is actually faster

  • @mechmaverick
    @mechmaverick 7 months ago

    I just found your channel and you're the Dr Disrespect of software, get some sunglasses

  • @amjad-se
    @amjad-se 7 months ago

    Could you please do a video on Pocketbase?

  • @viktorhugo1715
    @viktorhugo1715 7 months ago

    Renato Pereira is a Brazilian name soooooo...
    BRAZIL MENTIONED LWSGOOOOOOOOOO BRAZIL!!11!1!1!1!!1!1!1!11!1!1!1!1!!!!1!!1!1!1!!1!

  • @ytdlgandalf
    @ytdlgandalf 7 months ago +1

    nobody is wondering how he can read 13GB in under a second? Really?

  • @pantsoff
    @pantsoff 7 months ago

    Flip didn't take it out

  • @lskywalker5
    @lskywalker5 7 months ago

    GOD DAMN IT FLIP

  • @valhalla_dev
    @valhalla_dev 6 months ago

    "I have very little experience in these kinds of investigations"
    Me: Oh, word, he and I will be talking on the same level
    ...
    Me: Oh, shit, I understand none of this

  • @bluecup25
    @bluecup25 7 months ago

    15:55 - Ignored

  • @havokgames8297
    @havokgames8297 7 months ago

    Stalagmite - *might* reach the ceiling one day
    Stalactite - holding on *tight* so it doesn't fall

  • @bhuvya11
    @bhuvya11 7 months ago

    I want someone to try this in javascript 😂😂😂

  • @sebastianwapniarski2077
    @sebastianwapniarski2077 7 months ago

    There are two kinds of great professionals who show off their skills: 1) will make you inspired 2) will throw you into despair. For me Prime is the second kind. But he's funny, I'll give him that. And him boasting about how he ruined everyone's day when he got that calc test done way ahead of others back in his uni times is just proof of this.

  • @jazzochannel
    @jazzochannel 7 months ago

    how can i insert a yomoma joke here, or an insult involving your mom?

  • @willembeltman
    @willembeltman 7 months ago

    8:00 reason is the buffersize of your hdd/ssd.

  • @FaZekiller-qe3uf
    @FaZekiller-qe3uf 7 months ago

    Joelang

  • @FrederikSchumacher
    @FrederikSchumacher 7 months ago

    Gopoutine

  • @BoominGame
    @BoominGame 7 months ago

    Why windows, that's gotta count for half the slow down, you want to optimise, get rid of windows.