How Much Memory for 1,000,000 Threads in 7 Languages | Go, Rust, C#, Elixir, Java, Node, Python

  • Published: May 27, 2023
  • Recorded live on twitch, GET IN
    / theprimeagen
    ty piotr!
    pkolaczk.github.io/memory-con...
    MY MAIN YT CHANNEL: Has well edited engineering videos
    / theprimeagen
    Discord
    / discord
    Have something for me to read or react to?: / theprimeagenreact
  • Science

Comments • 1.1K

  • @jonathan-._.- 1 year ago +967

    comparing actual threads with async tasks seems kinda weird

    • @ccgarciab 1 year ago +113

      And workers and a plain event loop. Terrible all around.

    • @MikyLestat 1 year ago +42

      They are not the same, but async tasks are a powerful feature that isn't available in all languages. It's true he wasn't comparing like with like, but you could argue that he was comparing how you would achieve the same thing if you wrote it in each language.

    • @lozanov95 1 year ago +11

      @@MikyLestat Depends, because with Python you will run on a single thread, but with Go, for example, you will use multiple threads. If you are actually computing anything, this will make a significant difference.

    • @MikyLestat 1 year ago +3

      @@lozanov95 Exactly. I think the reason for the comparison is to get an indication of how much memory (at minimum) each programming language will use to achieve the same thing. Achieving the same thing in each language means using that language's own features and constructs. Python is a great language, but it isn't the fastest. The global interpreter lock (in addition to Python being interpreted in CPython) makes it slow.
      Just because Python doesn't really have multithreading doesn't mean we shouldn't use multithreading/tasks in other languages and then profile the memory footprint.

    • @davidstephen7070 1 year ago +2

      @@MikyLestat I think this is the wrong way to compare: a language that only runs on a single thread against multithreaded ones, to get the memory required to run those tasks. Garbage collectors have a feature to queue overloaded threads, so the fastest process means lower memory. And for tasks whose sizes vary widely, say the first task is 20 KB and the 70th is 1 MB, a higher initial heap size gives a better response than setting the initial size to 50 KB and reallocating. It all depends on the user's hardware whether to favor the CPU way or the memory way: if memory is cheaper than CPU, go for memory; if CPU is cheaper, choose something like Go or Rust that reallocates frequently.

  • @thedoctor5478 1 year ago +1817

    Using Python's asyncio for this test was the wrong thing to do. It's similar to what was done with NodeJS. Asyncio is an event loop, not a thread. Python has threading libs for threads.

    • @Kobrar44 1 year ago +92

      multiprocessing xD no need for a benchmark, it would be just atrocious

    • @nikonyrh 1 year ago +59

      @@Kobrar44 Yeah just run "multiprocessing.Pool(int(1e6))" and you are good to go :D Argh I hate python, but it is still my main language.

    • @just_a_random_ 1 year ago +21

      @@nikonyrh Just curious, why do you hate Python?

    • @magicbob8 1 year ago +63

      But asyncio is faster because Python's multithreading is so bad, so it's what people use. And it accomplishes the same things.

    • @ibrahimaba8966 1 year ago +32

      This is an I/O task, so asyncio is the right solution!
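The distinction argued over in this thread (asyncio tasks share one event-loop thread, while `threading` creates real OS threads) can be checked with a small sketch. The task and thread counts below are arbitrary assumptions chosen to keep the run short; they are not the benchmark's settings.

```python
import asyncio
import threading


async def idle_task(seen_threads: set) -> None:
    # Record which OS thread runs this coroutine, then wait without blocking.
    seen_threads.add(threading.get_ident())
    await asyncio.sleep(0.01)


async def run_async_tasks(n: int) -> int:
    seen_threads: set = set()
    await asyncio.gather(*(idle_task(seen_threads) for _ in range(n)))
    return len(seen_threads)


def run_os_threads(n: int) -> int:
    seen_threads: set = set()
    barrier = threading.Barrier(n)

    def worker() -> None:
        seen_threads.add(threading.get_ident())
        barrier.wait()  # keep every thread alive until all of them have started

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(seen_threads)


async_thread_count = asyncio.run(run_async_tasks(1000))  # 1000 tasks, one thread
os_thread_count = run_os_threads(50)                     # 50 threads, 50 idents
print(async_thread_count, os_thread_count)
```

The barrier is only there so that thread identifiers cannot be reused by a later thread, which would undercount the OS threads.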

  • @nunograca2779 1 year ago +724

    If I'm not wrong, C# uses a thread pool behind the scenes when using async/await, and what it does is recycle threads. That's why in the first test it was way higher than the others. I think that was the thread pool being initialized with a bunch of threads.

    • @dziarskihenk8798 1 year ago +31

      this.

    • @3ventic 1 year ago +50

      Yup. It always allocates a fixed size pool of managed threads depending on the system it's running on, unless you set the size yourself, which is possible and would be separately interesting for this benchmark.

    • @MikyLestat 1 year ago +71

      @@3ventic The ThreadPool default is much smaller, it shouldn't take 120 MB at idle. I'm betting he wasn't distinguishing between allocated and committed memory.

    • @GabrielSantAna-sm9zh 1 year ago +25

      As far as I know, C# also compiles async methods into stateful classes, so it generates the states for each “step” of processing beforehand. When you create that many tasks, you are basically creating a list of very small instances queued for the thread pool to consume until the next state (await), after which they are thrown back onto the end of the queue.

    • @3ventic 1 year ago +8

      ​@@MikyLestat I was a bit mistaken, but there is a fixed minimum number of threads (ThreadPool.GetMinThreads). On my system it's 32 by default and the equivalent program on my system (1 task) takes up 195M RES 108M SHR while a million tasks is using 52 threads and 472M RES 23M SHR.

  • @hansenchrisw 1 year ago +527

    Speaking as a Java apologist: it first got virtual threads in 1997 with version 1.1 (edit: later removed and recently re-added in v19). Also, Java (and presumably .NET) pre-allocates a bunch of memory by default. That's why memory looks high for small numbers of threads and doesn’t increase until you hit bigger numbers.

    • @Talk378 1 year ago +50

      Yep, rare prime L

    • @elraito 1 year ago +24

      Yes, but I ran the same code AOT compiled for C# and it's only a 5 MB baseline. The blog author misrepresented C# badly.

    • @hansenchrisw 1 year ago +26

      @@elraito no doubt, but I don’t expect someone to be proficient at all those langs/runtimes.

    • @giuliopimenoff 1 year ago +5

      That's why they should have used Kotlin coroutines

    • @mishikookropiridze5079 1 year ago +1

      ​@@elraito That's the variation introduced by running it locally.

  • @Deemo_codes 1 year ago +226

    Each Elixir process spawns with a 50k heap, and garbage collection happens at the per-process level (you don't stop the world, you stop a process). This is because processes in Elixir are used the way microservices are used: each process does a small amount of work, then sends a message on to another service.
    The Erlang VM that Elixir runs on launches one scheduler per CPU and does pre-emptive multitasking. So if you had 1 million processes doing stuff, each process would execute for a few ms, then be switched out and added back to the queue that the schedulers pull from. If you have more cores you get more parallelism; if you only have one core you still get concurrency.
    Whereas async runtimes tend to be cooperative and require some form of explicit yielding from a running task, Elixir will just swap stuff out. That makes it good for soft real-time stuff. If you want to do CPU-intensive things you can delegate to NIFs (native implemented functions) written in C or Rust. The Rust ones tend to be safer, since panics are caught and raised as errors in Elixir, whereas a crash in C will take down the whole VM.

    • @Overminddl1 1 year ago +17

      You can also specify the memory usage of a process on the BEAM VM, thus significantly reducing the amount of memory something uses when it's spawned and doesn't really allocate anything, as in this case.

    • @madlep 1 year ago +17

      And to do a test closer to what some of the other runtimes are doing, just call :timer.send_after(10000, :done) a million times, and then do a loop to receive :done 1 million times. Takes about 200mb instead.

    • @genericjam9866 6 months ago +4

      Elixir / Erlang processes have far less memory by default. More like 256 bytes but depends on word size on your system iirc.

    • @nyahhbinghi 5 months ago +1

      really smart GC model! Elixir was very well designed

    • @nyahhbinghi 5 months ago +2

      I wouldn't compare it to microservices. I would just say Elixir processes are independent and don't share memory. Which really makes it unique (I don't know of another runtime like this except Node.js webworkers).
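The contrast drawn in this thread between cooperative async runtimes and the BEAM's pre-emptive scheduling can be illustrated from the cooperative side. This is a sketch using Python's asyncio, not Elixir: a coroutine that never awaits keeps the event loop to itself until it finishes, which is exactly what a pre-emptive scheduler prevents. The coroutine names are hypothetical.

```python
import asyncio


async def polite(name: str, log: list) -> None:
    for i in range(3):
        log.append(f"{name}{i}")
        await asyncio.sleep(0)  # explicit yield back to the event loop


async def greedy(name: str, log: list) -> None:
    for i in range(3):
        log.append(f"{name}{i}")  # no await inside the loop: never yields


async def run(first, second) -> list:
    log: list = []
    await asyncio.gather(first("a", log), second("b", log))
    return log


interleaved = asyncio.run(run(polite, polite))
starved = asyncio.run(run(greedy, polite))
print(interleaved)  # ['a0', 'b0', 'a1', 'b1', 'a2', 'b2']
print(starved)      # ['a0', 'a1', 'a2', 'b0', 'b1', 'b2']
```

In the second run, task `b` cannot start until `a` has run to completion, because nothing forces `a` off the loop.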

  • @shreyassreenivas4786 11 months ago +69

    Go reserves 4K of memory for each thread's stack so you could do quite a bit of work on each of those threads without incurring further costs.

    • @demyk214 11 months ago +2

      Makes sense

    • @-rate6326 4 months ago +5

      goroutines aren't threads.

    • @tablettablete186 2 months ago

      @@-rate6326 Yeah, Go actually creates all threads at startup and just assigns goroutines to them.
      All of this to say: it's a thread pool lol

  • @devotiongeo 1 year ago +233

    Creating a million concurrent "tasks" (or spawning processes, as we call them in Erlang/Elixir) and allowing them to remain idle is one thing, while making those processes actually do something, such as each one of them having a persistent connection to a client and feeding it, is something entirely different. In practical terms, when it comes to real-time apps, the BEAM (Elixir/Erlang) outperforms all other languages by a significant margin.
    This is precisely why Brian Acton and Jan Koum chose Erlang for WhatsApp after years of experience with Yahoo Messenger and Yahoo Chat Rooms. If someone hasn't had the opportunity to work with any BEAM language, the above statement may appear to them as an empty boast, and I can't blame them for that.

    • @ThugLifeModafocah 11 months ago +3

      But then this example needs to be done and shown to the world, just as Primeagen is reacting to this one. I'm surprised by Elixir's performance here... in a bad way.

    • @xbmarx 11 months ago +39

      @@ThugLifeModafocah I'm not. Erlang processes are completely isolated. COMPLETELY. Every "task" has a separate GC, memory space, everything.

    • @szymonbaranowski8184 11 months ago +7

      @@xbmarx So if things crash, only those things crash. That's a feature in itself.

    • @Aaku13 11 months ago +24

      The BEAM is pretty quick, but it won't "outperform all other languages by a significant margin". Ran several huge elixir services in production with lots of traffic and our Go services were much more performant.

    • @osazemeusen1091 9 months ago +8

      @@Aaku13 I can agree only for CPU-bound tasks. For I/O-bound tasks, Golang doesn't come close to Elixir in performance.

  • @casperes0912 1 year ago +95

    There's also the memory vs. speed tradeoff. Sometimes keeping more things in memory can also make it faster. If the managed environments that start with higher memory usage already have a bunch of kernel threads lying dormant in a thread pool, that takes up memory but speeds up the spawning of threads.

    • @cakedon 5 months ago +9

      if my hello world doesnt use 27 gigabytes of ram i wont write it

    • @maximumcockage6503 5 months ago +3

      Yeah. Bun.js was priding itself on being faster than Rust in its beta. Then when it came out and people started benchmarking, it was slightly faster than Rust, by like a few percent, but used 40 times more memory on average.

  • @TanigaDanae 1 year ago +129

    One piece of information not mentioned in the video: async functions in C# are state machines, and Tasks (which are part of the Task Parallel Library) are automatically run on thread pools. So the only internal state these async functions have is the time at which they need to wake up, and all Tasks could theoretically have the same wakeup time.
    I would've loved to see a C# Thread implementation. I suspect the C# compiler is optimizing redundant Tasks away since they lack any side effects.

    • @vitskr1 1 year ago +10

      The thread pool has something like 512 preallocated threads, hence the high memory usage at idle. The tasks are actually running, but the max degree of parallelism is 8 (8-thread CPU), so there is practically nothing to allocate.

    • @q1joe 1 year ago +2

      @@vitskr1 You can tune this if you know your workload, though. Some languages, I feel, didn’t get the best showing here as the author isn’t an expert in each one, which is understandable.

    • @monad_tcp 1 year ago +2

      @@vitskr1 Exactly what I suspected: ruclips.net/video/WjKQQAFwrR4/видео.html . It's using the Server tuning; I think on Desktop the default is number of cores * 2.

    • @monad_tcp 1 year ago +4

      @@vitskr1 512 threads * 512 KB = 256 MB. It's not that big of a deal for servers with lots of cores.

    • @bangonkali 8 months ago

      @@monad_tcp I agree. And IRL, if you plan to launch 1M concurrent tasks, you probably have the RAM to match. I still don't think many people do this in a single process anyway. It's probably better to distribute the workload across multiple servers. I recommend Orleans 7 for C# devs. 😅

  • @Hallo503 10 months ago +137

    C# has the lowest memory usage because it is using the threadpool, that recycles blocking threads, like when calling Task.Delay. So there aren’t actually a million threads created but rather they are queued into the threadpool. To avoid this create the threads explicitly

    • @user-qu5cc5oe2h 7 months ago +61

      pff... everyone knows that c# offloads 50% of tasks on Azure servers

    • @dieSpinnt 7 months ago

      @@user-qu5cc5oe2h ROTFL.
      As a first-time viewer I asked myself if ThePrimeTime is always on that level of cocaine?
      Well, it's something different from other coding channels. A fresh breeze, so to say .... **g**

    • @muaathasali4509 6 months ago

      @@user-qu5cc5oe2h free compute hack

    • @qendrimimeri8561 5 months ago

      ​@@user-qu5cc5oe2h😂

    • @gregorymorse8423 3 months ago +2

      No shit, Sherlock, all of the languages were using threadpools except Java and Rust with real worker threads. So you've failed to uniquely qualify C# altogether.

  • @ThePhoenixProduction 6 months ago +159

    Where is c++?

    • @ErickBuildsStuff 1 month ago +9

      No one cares 😅

    • @SowTag 1 month ago +117

      @@ErickBuildsStuff Ah yes, no one cares about one of the most important and influential programming languages of all computing history

    • @InternetExplorer687 1 month ago +41

      @@SowTag I'd argue that C is more influential, but yeah, saying no one cares about the language most used in performance-critical applications that also need low-level access to memory is a really big stretch.

    • @jstro-hobbytech 1 month ago +3

      This guy reminds me of yongyea. Parrots other's work and makes more than the authors combined. He has no insight or original opinions or educated insight (from experiences academic or otherwise).
      I hate how people raise this guy up.
      Agreed on c++. That's my personal preference as I like the syntax being I learned it the same term I took cobol, Java (when it was new), visual basic and oop was still being defined.
      I've never worked in industry as a programmer but keep up to a middling ability.
      One thing I do know is that bullshit always smells like bullshit and this dude is full of it. People that talk during react videos do so only to fall under fair use, I see the same here transposed to a topic he is novice. Want for choice as mediocrity's excuse is no less evident than an untrained hand on display for no person's betterment or an opiate of excuse to be subject for one not turning to their purpose.
      I'm as wrong as apt to be right so there's that as well.

    • @idkwhatcouldbeavalable 1 month ago

      ​@@jstro-hobbytech I personally use Rust as it keeps some of the cpp syntax and adds on top of it to prevent common mistakes.

  • @markusn4614 1 year ago +126

    That C# method has 2 extra layers, the code inside the for loop should just be tasks.Add(Task.Delay(TimeSpan.FromSeconds(10)));

    • @Eirenarch 1 year ago +19

      This 👆
      They created threads to run their threads inside

    • @PetrVejchoda 11 months ago

      @@Eirenarch No, it should not. If you did it the way you describe, the work (in this case represented by Task.Delay) would not be scheduled on the TaskScheduler and would instead be done on the thread this code is running on, thus blocking it and not using the CPU cores to their fullest.
      If anything, it should be Task task = Task.Run(Task.Delay(TimeSpan ...)); tasks.Add(task); This would save some memory while still scheduling the work on worker threads.
      I am not sure whether there would be any benefit, or whether it would be more performant, if you used TaskFactory and the Scheduler directly, but I highly doubt it.
      Task itself is a glorified coroutine and job child. It's just a promise of an action that can wait for other actions to complete. Task.Delay does not do anything with scheduling or threading. It just writes a timestamp and deposits the Task to run later, when the proper time has come. But it would not start a new thread/virtual thread/Task/coroutine. Since they are trying to figure out how costly scheduling a new thread/virtual thread/Task/coroutine is, this would not do the job.

    • @manpt123 10 months ago

      c# and you are the 2 most useless stuffs

    • @FilipCordas 10 months ago

      Also I don't see value tasks and the list doesn't have a buffer set.

    • @taqial-faris6421 8 months ago +3

      I was looking for this comment. Guy who created that blog clearly knows nothing since he is using chatGPT and chatGPT also knows nothing if it outputs that kind of code... But hey, even my 'senior' coworker used to write async code like that so who am I to judge.

  • @SirBearingtonSupporter 11 months ago +12

    You actually pointed this out early on. In the Java and C# version, he uses "ArrayList" without specifying the size.
    ArrayList in both these languages hold an actual Array object. It's why the lookup time for "get" is a memory address lookup time.
    When Java needs to expand the array size, it creates a larger array that is twice the size of the current array size. I believe the default is 10.
    Java also doesn't run the garbage collector unless it needs to be run or specifically invoked with System.gc.
    Because the JRE doesn't plan ahead for your bad code, it just looks for a new place to put the object in memory, leaving all the old references that need to be deleted alone - because the GC will deal with it as needed.
    Just to recap there are several arraylist objects each holding an array of size n (below) in memory - and if the JVM is given enough memory, all 11 of these will still be there.
    So that means there are 20510 threads in memory on the test.
    While his approach to joining all the threads was barbaric, it's also the accepted answer on StackOverflow, and we are not measuring the speed of execution, just its memory.
    If you were not trying to measure the memory performance of threading in different languages, I would actually give Java more threads to manage the threads (parallelize the stream).
    Final thoughts:
    We aren't concerned about thread space on production equipment. We are concerned about execution time, and if my entire program hangs because one calculation couldn't be done, I'm missing out on something important: it could be a trade, moving a servo for a robot (self-driving cars), or producing an input for a chess game. Collecting the information that I can allows me to implement an algorithm that is capable of making educated guesses based on what was calculated.
    If we do care about thread space, we would be better off writing single-threaded applications, since we don't have the overhead associated with the effing cost of the thread.
    TL;DR
    Something something short equal something something int because the JVM go fast blah blah addresses blah blah blah 4. (primitive array blah blah addresses, blah blah)
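The array-growth argument in this comment can be simulated directly. The sketch below assumes an initial capacity of 10 and 10,000 appended elements; both are assumptions for illustration, and the helper name is hypothetical. Doubling is the policy the comment describes (and the one C#'s `List<T>` uses); OpenJDK's `ArrayList` actually grows by about 1.5x, so that policy is shown alongside for comparison.

```python
def growth_history(target: int, initial: int, grow) -> list:
    """Return every backing-array capacity allocated while appending `target` items."""
    capacities = [initial]
    while capacities[-1] < target:
        capacities.append(grow(capacities[-1]))
    return capacities


# Policy described in the comment: each new array is twice the old one.
doubling = growth_history(10_000, 10, lambda c: c * 2)
# OpenJDK ArrayList policy: old capacity plus half of it.
one_and_half = growth_history(10_000, 10, lambda c: c + (c >> 1))

print(len(doubling), sum(doubling))          # number of arrays, total slots allocated
print(len(one_and_half), sum(one_and_half))
```

Under the doubling assumption this gives 11 arrays, matching the count in the comment. The total is a count of reference slots across all arrays ever allocated, not a count of thread objects, and the old arrays only stay in memory until the garbage collector reclaims them.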

  • @diadetediotedio6918 1 year ago +254

    C# was the winner, clearly everybody was expecting this

    • @sanampakuwal 1 year ago +6

      yes

    • @shreyasjejurkar1233 11 months ago +22

      Of course, kudos to .NET runtime team! 😎

    • @mattymerr701 10 months ago

      Clearly they fucked their setup
      [Insert cope here]
      To be fair, they did fuck it but...

    • @cnikolov 10 months ago +7

      Running as AOT has even smaller footprint

    • @FilipCordas 10 months ago +12

      Also, he wasn't using ValueTask, which reduces memory consumption considerably. But I hate tests like this because a compiler could remove everything, since the code isn't doing anything.

  • @bahtiyarozdere9303 5 months ago +2

    Thank you for sharing and commenting on this one. I would love to see C# with AOT compile. I believe it would make a huge difference.

  • @wlockuz4467 1 year ago +26

    It should've been "To infinity and NaN" as an homage to JavaScript.

  • @bryanenglish7841 1 year ago +212

    You forgot the extra Rust thread it takes to track all the bullshit drama in the Rust community

    • @Marhaenism1930 1 year ago +14

      oopsy! is it a new feature of crablang in 2023?

    • @BlackistedGod 1 year ago +4

      dammit why did I laugh so hard at this

    • @JensRoland 11 months ago +13

      The Rust forums are just clogged with unproductive / outdated discussions that lead nowhere and make it harder to get anywhere as a community. The mods should simply go through all the threads once in a while and nuke the ones that are no longer relevant or helpful so the good stuff can get more space and everything would run smoother. Maybe they could even automate this with an LLM agent? They could call it “RustScheduledGarbageRemover”

    • @juniuwu 4 months ago +6

      @@JensRoland Garbage Collector? BAN

    • @JensRoland 4 months ago +8

      @@juniuwu banning people is just garbage collection for communities ;-)

  • @chigozie123 5 months ago +5

    The Go results are not surprising. It's a well-documented feature that each goroutine starts with a pre-allocated initial stack. Prior to Go 1.2 it was 4 KB, then it went to 8 KB, and I believe it's now at 2 KB for Go 1.4+.
    So 2 KB × 10k means an additional 20 MB on start. At 100k, it means a minimum of 200 MB on start.
    The math seems pretty consistent with the results we see for Go, although they seem to suggest that the initial stack size may be closer to 2.7 KB than 2 KB.
    We also have to keep in mind that there is a garbage collector running in there, and we didn’t account for how much memory it requires to keep track of everything going on.
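The stack arithmetic in this comment can be written out as a lower bound. The sketch assumes the 2 KB initial stack quoted above is exactly 2,048 bytes and ignores everything else the runtime allocates (scheduler structures, heap, GC metadata), so the real footprint will be higher. The function name is hypothetical.

```python
STACK_BYTES = 2 * 1024  # initial goroutine stack size quoted in the comment (assumed exact)


def min_stack_mb(goroutines: int, stack_bytes: int = STACK_BYTES) -> float:
    """Lower bound on memory used by initial stacks, in decimal megabytes."""
    return goroutines * stack_bytes / 1_000_000


for n in (10_000, 100_000, 1_000_000):
    print(f"{n:>9,} goroutines -> at least {min_stack_mb(n):,.2f} MB of stacks")
```

With these assumptions the bound is about 20.5 MB at 10k goroutines, 205 MB at 100k, and roughly 2 GB at one million, in line with the rounded figures in the comment.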

  • @igordasunddas3377 1 year ago +49

    Man I am allergic to empty catch blocks in Java - always. After looking for exceptions that have never been rethrown or really handled, I am really on the fence. Empty catch blocks should not exist or even be allowed...

    • @gregorymorse8423 3 months ago

      You are allergic to using your brain, yes we know. Maybe if you knew what checked and unchecked exceptions are and stopped making dumb comments. This is why you should stop the drugs and go back to school, fool

    • @albertmagician8613 1 month ago

      I have no problems with empty catch blocks, as long as my compiler is allowed to optimize them away.

  • @NameyNames 9 months ago +26

    As likely already pointed out, C# uses a thread pool, and will definitely not create a gazillion threads in this test, and the memory required to house all of these insignificant tasks will be very small, which is apparent in the test results.
    I tried it out in LinqPad, but with one additional task whose only purpose was to keep track of the number of simultaneous threads actually in use. For 1 million tasks, the actual active thread count peak never even exceeded 50 on my system (usually much lower). No wonder, when all that the tasks are "doing" is async-waiting on a delay.
    This benchmark is broken in the sense that it doesn't really do what the author thinks it does, i.e. it does NOT create a lot of threads (virtual or otherwise) in all languages/runtimes, and measuring the memory usage is thus close to pointless.
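The thread-counting experiment described in this comment was done in C# with LinqPad; a rough analogue in Python shows the same effect with a fixed-size pool. The pool size and task count are hypothetical values chosen for a quick run, and `ThreadPoolExecutor` is a stand-in, not the .NET ThreadPool.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 4      # hypothetical pool size
TASK_COUNT = 1000  # hypothetical number of queued tasks


def tiny_task(_: int) -> int:
    time.sleep(0.001)              # stand-in for a short wait
    return threading.get_ident()   # report which pooled thread ran the task


with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
    thread_ids = set(pool.map(tiny_task, range(TASK_COUNT)))

print(f"{TASK_COUNT} tasks ran on {len(thread_ids)} threads")
```

However many tasks are queued, the number of distinct threads never exceeds the pool size, which is why counting tasks says little about how many threads a runtime actually created.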

  • @W1ngSMC 1 year ago +67

    To be fair, Elixir is spawning new processes with their own memory and PID (inside the VM).

    • @isaacyonemoto 1 year ago +25

      And also providing stuff for graceful restarts and an entire message queue

    • @BosonCollider 1 year ago +13

      And preemptive scheduling, if any one of them fails or blocks indefinitely it cannot take the rest down with it.

    • @sukidhardarisi4992 4 months ago +1

      Using Task.async in Elixir comes with a lot of boilerplate wrapped on top of GenServer. If the test is meant to measure concurrent tasks, one could use primitives like spawn, send and receive to see the true potential. Just my opinion on why Elixir used a lot of memory.

    • @gregorymorse8423 3 months ago +1

      It's not doing anything. The erlang process concept has nothing to do with threading. Sure it explains the memory usage, but there are ways to pool it so a maximum amount of processes could be spawned at any time.

  • @Lyynx92 1 year ago +5

    .NET pre-allocates a thread pool at startup, though the memory shouldn't be quite that high. Pretty sure it also uses a work-stealing scheduler under the hood for continuations and its async/await behavior. Also, if you want to optimize further for memory, the ValueTask struct will do some caching cleverness to dodge Task allocations if the work is either already done or can be done synchronously. Given how simple the test is, the GC probably won't kick in, as it can recycle a lot of those Task objects.

  • @metaphysicalconifercone182 1 year ago +108

    I wonder why Kotlin wasn't included. I guess it does share similarities with Java and Go, but its implementation of coroutines is supposed to be different from Go's. I guess testing it would also have to include both the JVM and Native compile targets, because you never know.

    • @avalagum7957 1 year ago +6

      If you include kotlinx library, you should add Scala Actor, ZIO ... too.

    • @DeliOZzz 1 year ago +5

      @@avalagum7957 The suspend keyword and channels are part of the standard Kotlin library. The coroutines package includes coroutine builders and stuff like flows.
      For some reason Prime just ignores Kotlin altogether :/ But I'd really like to watch some quality Kotlin roast.

    • @sharkpyro93 11 months ago +7

      @@DeliOZzz Because it's not a popular choice for backends. A lot of people still think Kotlin is only for Android. I'm afraid this stigma will stick around for the time being.

    • @AlanPCS 6 months ago +4

      It runs in the same VM. At most it would be equal to a competent implementation in Java only.

  • @stevenhe3462 1 year ago +12

    Elixir reserves 4kiB of RAM for each of its processes. Each process in Elixir has its own separate heap to eliminate the possibility of stop-the-world-GC.

    • @llothar68 1 year ago +2

      Each Linux kernel thread needs 32kb (28kb of it are non swappable physical kernel stack space) + 1kb for kernel structures.

  • @madlep 1 year ago +22

    The Elixir solution has a LOT of room to squeeze out. I can get it running in about 990mb with some tweaks. Main thing is the default heap size. Passing `+hms 1` as part of `erl` options sets default size to 1 4-byte word. Also, using plain spawn calls instead of Task (which accumulates results, and adds extra memory and GC and processing overhead) reduces it further.

    • @mennovanlavieren3885 1 year ago +4

      True, but as long as the "threads" don't actually do anything it is a useless comparison. The constructs on these platform all provide a different feature set, so comparing performance is bogus. I mean a C# Task is just one or a few objects waiting in several queues to be invoked by native threads in the thread pool with a job stealing algorithm. NodeJs and Python are single threaded with a single event loop. I don't know what the others do and give you for free, but this isn't apples to apples.
      (Edit: I automatically type thread with a capital T)

    • @madlep 1 year ago +10

      @@mennovanlavieren3885 Yup. The comparison is pretty meaningless. The "cheap", non-idiomatic Elixir way to do this would be to start 1,000,000 timers and wait for them to finish, effectively doing the same thing as some other platforms. I just tried that: it uses about 200 MB of memory in total.
      If all it's doing is starting something that sits there idly for 10 seconds, there isn't much difference.
      No point carting round a whole isolated separate stack and heap for each process, and associated housekeeping. Elixir processes are cheap, but they're not *that* cheap.

  • @pinoniq 11 months ago +4

    If you want Node to actually use multiple threads, you need to tell libuv to use multiple threads. There is an env variable for this: UV_THREADPOOL_SIZE. Like you said, Node has an event loop. That's not multi-threaded. It's single-threaded with callbacks. That's why setTimeout is more of a 'minimum' guideline and not precise at all (under heavy load). Just make a busy-wait program in Node and you'll see it only filling up a single core on your CPU.
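The point that a timer on a single-threaded event loop is only a minimum delay can be demonstrated with an analogue in Python's asyncio (not Node itself). The durations are arbitrary assumptions: a 10 ms timer is requested while another coroutine blocks the loop for 50 ms.

```python
import asyncio
import time


async def timer(requested: float) -> float:
    start = time.monotonic()
    await asyncio.sleep(requested)     # asks to be woken after `requested` seconds
    return time.monotonic() - start    # how long the wait really took


async def blocker(duration: float) -> None:
    time.sleep(duration)  # blocking call: the event loop cannot run anything else


async def main() -> float:
    elapsed, _ = await asyncio.gather(timer(0.01), blocker(0.05))
    return elapsed


elapsed = asyncio.run(main())
print(f"asked for 0.010s, waited {elapsed:.3f}s")
```

The timer cannot fire until the blocking coroutine returns control to the loop, so the observed wait is governed by the 50 ms block rather than the 10 ms request.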

  • @tofaa3668 8 months ago +2

    The issue with the Java threads, I feel, is not preallocating the array list. Every time an ArrayList is appended to, it checks its size and, when full, allocates a new array. In this case that would leave a whole lot of arrays in memory for the GC to collect.

  • @woolfel 11 months ago +3

    back in the JDK 1.3 days, the JVM would allocate 1MB per thread, but it was changed around 1.6/1.8, I forget exactly which release they fixed that. It's also important in Java to get the memory used, not memory allocated. The biggest issue with java for me is once the JVM allocates memory, it doesn't release it until you stop the JVM process.

  • @Trekiros 1 year ago +27

    Intro: let's not compare apples to potatoes
    The rest of the video: compares making threads with maintaining an event queue

  • @Mentox2
    @Mentox2 1 year ago +37

    9:30 - In the 19th century the German mathematician Georg Cantor proved that there must be more than one kind of infinity, such as the infinity of the natural numbers, the infinity of the real numbers and so on, and that some infinities are larger than others. The smallest infinity is that of the natural numbers, and it's called Aleph Zero.
    So yes, Buzz can indeed go to infinity and beyond, so long as it is mathematical infinity.

    • @ko-Daegu
      @ko-Daegu 1 year ago +3

      Pretty cool. I remember studying this part of set theory and how Alef (the first letter of the Arabic alphabet) is used: the idea is that the set of natural numbers (1, 2, 3, ...) has the smallest infinite cardinality and is denoted as Aleph Zero (ℵ₀).

    • @JamieNeubertPedersen
      @JamieNeubertPedersen 1 year ago +1

      Thanks. Was thinking the same.

    • @user-zt7gj5ff8n
      @user-zt7gj5ff8n 1 year ago +3

      Nothing "and so on". That is not clear. In fact it can neither be proven nor disproven with standard mathematics. It is called the continuum hypothesis.

    • @mykhailonikolaichuk6392
      @mykhailonikolaichuk6392 11 months ago

      @@user-zt7gj5ff8n The continuum hypothesis is that there are no intermediary infinities between "infinity of integers" and "infinity of reals". It is, indeed, but an axiom. However, the cartesian product of a set with itself ALWAYS yields a set with higher cardinality, so infinitely many distinct infinities can be constructed by the repeated usage of it.

    • @d7ffab979
      @d7ffab979 11 months ago +1

      @@mykhailonikolaichuk6392 That is just wrong. Infinite cartesian products of natural numbers, for example, are "just" rational numbers.

  • @TizzyD
    @TizzyD 1 year ago +5

    🤔 I concur with you Big P...let's look at some more real use cases. Going outside of the process itself will complicate analysis with other elements (e.g. DB, ORM, etc.) that should be held constant; however, there are good use cases to eliminate as much of the 7 layer stack as we can:
    1. Storage - with the good old random file manipulation, etc.
    2. Network - doing something more like a UDP listener to eliminate possible contamination with socket handling
    3. Memory - malloc, 😮multi-threaded data manipulation, release (to watch garbage collection)
    4. Compute - not all compute operations are math-based, but do some string parsing, concatenation, etc.
    I'm thinking we want to eliminate math computations because most of those operations will come down to the underlying math implementation vs. actual performance (e.g. Fortran being fast, etc.), but network issues could have the same impact. Consider the history of Java IO vs. NIO.

  • @jonstewart5525
    @jonstewart5525 3 months ago +1

    Since this is a Linux system, it’s using the Completely Fair Scheduler (CFS), which means each thread runs at the same priority (as opposed to the MLFQ (multilevel feedback queue) that Windows uses). The issue then is that the OS is processing at the same priority as each of the threads created, so the computer just freezes up. There’s also a minimum time spent in each thread, so you rarely get to execute an action.

  • @Jmcgee1125
    @Jmcgee1125 1 year ago +19

    15:11 Python, by default, only uses one worker thread. When writing asyncio code you do need to be careful that you don't block. My understanding is that each event loop may have only one worker, but I'm not experienced enough to be confident in saying that.

    • @ShaneFagan
      @ShaneFagan 3 months ago +1

      To expand on this a little more for people:
      1. They used asyncio, which is just an event loop. There is no threading, just a loop that does the tasks in FIFO order. The memory usage would be just the amount that stores the task information/statuses; it wouldn't have overhead from spawning threads.
      2. Virtual threads in Python are in the threading module. They are limited to one core but can run in parallel and independently, as you would expect from a thread.
      3. For proper hardware threads you have to use multiprocessing. It works very similarly to other languages that use fork, but with added stuff like the ability to spawn a pool for batch processing and to limit the number of workers to a number that wouldn't cause stability issues on the system.
      Also, in Python 3.12 there are some interesting changes related to the GIL which change how concurrency works in general, with the ability to run code in basically another instance of Python. That will change mega high performance Python concurrency quite a bit in the future, but as of right now it's one of the 3 options described above. Just note that the blog post he is talking about uses option 1, which isn't parallel.
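The difference between options 1 and 2 above can be checked directly. A small sketch (the helper names `run_tasks` and `run_threads` are invented for this example, not taken from the article): it records which OS thread each unit of work ran on. The asyncio tasks all report the same thread, while the `threading` workers each report their own.

```python
import asyncio
import threading


async def record_thread(seen: set) -> None:
    # Yield to the event loop once, then note which OS thread resumed us.
    await asyncio.sleep(0)
    seen.add(threading.get_ident())


async def run_tasks(count: int) -> set:
    """Run `count` asyncio tasks and return the set of thread ids they used."""
    seen: set = set()
    await asyncio.gather(*(record_thread(seen) for _ in range(count)))
    return seen


def run_threads(count: int) -> set:
    """Run `count` threading.Thread workers and return their thread ids."""
    seen: set = set()
    lock = threading.Lock()
    # The barrier keeps every thread alive at the same time,
    # so the OS cannot hand a finished thread's id to a new one.
    barrier = threading.Barrier(count)

    def worker() -> None:
        barrier.wait()
        with lock:
            seen.add(threading.get_ident())

    threads = [threading.Thread(target=worker) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen


if __name__ == "__main__":
    print("asyncio tasks used", len(asyncio.run(run_tasks(1000))), "thread(s)")
    print("threading used", len(run_threads(8)), "thread(s)")
```

This says nothing about memory by itself, but it shows why a million asyncio tasks and a million threads are not the same kind of workload.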

  • @baxiry.
    @baxiry. 11 months ago +9

    There is some important information not mentioned in the article
    Goroutines are compared to threads, either real or virtual.
    It is not compared to event loop
    Go has event loop libraries
    As long as the author of the article used the event loop in other languages, he should use it in Go as well in order for the comparison to be unbiased.
    Other information:
    The advantage of goroutines over threads is that it is portable. It does not depend on the operating system. If your application requires on-the-metal operation such as chips or microcontrollers that do not have an operating system, a goroutine can be run.
    With threads it is not possible. Because the language is not the one who does the job but the operating system. And where there is no operating system, there are no threads.
    One last thing
    When an application uses system threads, the system will reserve memory. The question is: Did the author of the article calculate the memory reserved by the system ??

  • @dipi71
    @dipi71 1 year ago +3

    Erlang, a language used in telecommunications, still seems to be the concurrency champion (according to a book by Röhrl and Schmiedl called »Produktiver programmieren«, I've read it in German a while ago).

  • @andzagorulko
    @andzagorulko 1 year ago +14

    C# has threads. Benchmarking Tasks instead is just confusing, because those aren't threads.

    • @pavelyeremenko4640
      @pavelyeremenko4640 1 year ago +3

      As you may have noticed, he's benchmarking green threads (tasks in C#, goroutines in Go, etc.) across the languages.

    • @carlinhos10002
      @carlinhos10002 1 year ago +5

      C# does not have green threads. Tasks are not green threads

    • @pavelyeremenko4640
      @pavelyeremenko4640 1 year ago

      @@carlinhos10002 Now that I've re-read the definition of green threads, I'm not sure how they aren't. They are not OS managed. They are lightweight thread-like primitives managed by the runtime. What are they missing?
      Wikipedia also lists them as such on en.wikipedia.org/wiki/Green_thread
      Not sure if this is as important though; every language in the list was using its concurrency primitive built on top of some managed pool anyway.

    • @metaltyphoon
      @metaltyphoon 1 year ago +1

      @@pavelyeremenko4640 he’s just making things up. Most implementations are using some abstraction over OS threads. Only one of Java and Rust versions don't do that.

    • @zephyrprime
      @zephyrprime 25 days ago

      C# tasks use a threadpool to execute. But one thread can have multiple tasks waiting simultaneously and the code this guy used had each thread sleeping for several seconds

  • @peppybocan
    @peppybocan 1 year ago +43

    So this article is definitely comparing apples to oranges - light threads/proper threads and runtime limitations.
    Go has support for parallelism, but it will only allocate as many threads as there are CPU cores (see the GOMAXPROCS env variable), and the runtime scheduler runs these tasks on those.
    Python, with its notorious GIL (Global Interpreter Lock), has its main bottleneck there, though it is not visible in this flawed benchmark, as the threads themselves are not doing anything. This looks fine until you actually need to run some code. So Python would very likely burn in a throughput benchmark, regardless of the number of threads. (See Python's sys.setswitchinterval.)
    NodeJS, as The Prime mentioned, again: a massive event loop and timers on it. If you do computationally heavy work on it, your one poor CPU will go into early retirement....
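For readers who have not met `sys.setswitchinterval`, here is a minimal sketch of where it sits. It assumes a standard CPython build with the GIL; the helper names `count_down` and `run_cpu_bound` and the chosen numbers are illustrative only. The switch interval is how long a thread may hold the interpreter before another waiting thread can ask for it, so CPU-bound threads take turns rather than run side by side.

```python
import sys
import threading


def count_down(n: int) -> None:
    # Pure-Python CPU-bound loop; it holds the GIL while it runs.
    while n > 0:
        n -= 1


def run_cpu_bound(workers: int, n: int) -> int:
    """Run `workers` CPU-bound threads to completion and return how many ran."""
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(threads)


if __name__ == "__main__":
    print("current switch interval (s):", sys.getswitchinterval())
    # Ask the interpreter to offer the GIL to other threads more often.
    sys.setswitchinterval(0.001)
    print("threads finished:", run_cpu_bound(workers=4, n=1_000_000))
```

Changing the interval only changes how often threads trade places; it does not make the bytecode of several threads execute at the same time.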

    • @daasdingo
      @daasdingo 1 year ago +1

      The article was using the single-threaded event loop in Python.

    • @peppybocan
      @peppybocan 1 year ago

      @@daasdingo still wrong though.

    • @mennovanlavieren3885
      @mennovanlavieren3885 1 year ago +2

      I concur. With IO-heavy tasks the NodeJs event loop is okay, and keeps your programming model simple. With computational work you need to use workers on NodeJs, as per the NodeJs documentation itself. And even with IO tasks you should not use one Node process on a gazillion-core machine.
      Also, not all light thread implementations (hate the word green in this context. Green, in practice, means illogically wasteful in the name of virtue signaling) offer the same features out of the box.

    • @ddomen9488
      @ddomen9488 1 month ago

      @@daasdingo Also in Node.js, since promises are not actual threads.

  • @om3galul989
    @om3galul989 11 months ago +3

    yea node example is not spawning threads, it's just placing tasks on the timeout callback queue of the eventloop to be executed later using the main thread.

  • @misterkevin_rs4401
    @misterkevin_rs4401 11 months ago +2

    C# uses a thread pool behind the scenes with a default config of some number of threads depending on the system it's running on; it's usually 20 if I remember correctly from my .NET days. What's interesting to me is how it can spin up more if required and scales correctly.

    • @FilipCordas
      @FilipCordas 10 months ago +1

      Should be equal to number of cores you have available on the machine.

  • @mattymerr701
    @mattymerr701 10 months ago +1

    C# uses loads of thread pools and I think the issue is they likely didn't trim the assemblies etc., so it kept a bunch of unused crap

  • @iforgot669
    @iforgot669 1 year ago +14

    C# now has Native AOT, which would have significantly improved the memory footprint of this

    • @SurvivalGamingyt
      @SurvivalGamingyt 1 year ago +8

      Yeah, 7.4 MB for just a standalone release mode app.

    • @sgbench
      @sgbench 1 year ago +1

      Also trimming

    • @FilipCordas
      @FilipCordas 10 months ago +1

      @@sgbench ValueTasks and adding a buffer size to the list will help.

    • @CeleChaudary
      @CeleChaudary 4 months ago

      @@FilipCordas That's a good point

  • @Bourn77
    @Bourn77 1 year ago +58

    C# master race. Let's go.
    The .NET team has been optimizing the fu*k out of the stack for a few years.
    Hands down the best API backend language to work with. 🥰

    • @reddragon2358
      @reddragon2358 1 year ago +7

      I hope that it becomes so good that it could be used as a full-stack language.

    • @BosonCollider
      @BosonCollider 1 year ago +2

      @@reddragon2358 It does work fairly well together with HTMX

    • @reddragon2358
      @reddragon2358 1 year ago

      @@BosonCollider Oh, glad to hear, but Java, for example, can be used for full-stack development with the help of Java frameworks.

    • @mishikookropiridze5079
      @mishikookropiridze5079 1 year ago +3

      @@reddragon2358 That produces horrendous UI. WASM could be the future.

    • @reddragon2358
      @reddragon2358 1 year ago +2

      @@mishikookropiridze5079 I heard that C# has UI frameworks. I hope that they get better with time.

  • @rahulagarwal968
    @rahulagarwal968 11 months ago

    For building the backend for a Flutter application or any frontend, which server-side language would you prefer: Go or Node.js?

  • @Overminddl1
    @Overminddl1 1 year ago +1

    I'm also curious how OCaml's task library would go, as well as rust using a future joiner instead of full tasks just for curiosity, lol

  • @robfielding8566
    @robfielding8566 11 months ago +16

    Go is definitely not a memory hog, at least for IO-intensive tasks. The main thing is that the Go libraries are always very careful to stream large inputs rather than buffer them in memory. Java itself doesn't really have major memory issues beyond spawning threads, but in any large Java project the code will be full of things being buffered into arrays rather than being streamed. I tried rewriting netty to make it stop doing dumb things, and just switched (permanently) to Go. Part of Java's problem is also the legal issues of shipping a JVM, and the existence of Oracle thumb-breakers and lawyers to come punish you for shipping.

  • @quachhengtony7651
    @quachhengtony7651 1 year ago +10

    C# fan bois are eating good these days

  • @blowfishfugu4230
    @blowfishfugu4230 7 months ago +1

    Just for fun, I tried creating threads in C++ in a similar fashion:
    #include <atomic>
    #include <thread>
    #include <vector>
    static std::atomic<int> toInc = 0;
    int main()
    {
        {
            std::vector<std::jthread> threads;
            for (int i = 0; i < 1'000'000; ++i)
            {
                // each thread increments the shared counter once
                threads.emplace_back(std::jthread{ []() {
                    toInc++;
                } });
            }
        } // leaving this scope joins and destroys every jthread
    }
    Running on a CPU with 8 cores, it took forever (we're talking about 15 minutes) to allocate the thread handles;
    the resulting peak memory consumption was 75 MB.
    Deallocating the thread handles took the same amount of time as creating them.
    So this test case highly depends on what kind of platform/OS is in use.
    Also, it's not advised to use more threads than your hardware can handle on native cores.
    On my system the highest multithreaded performance was
    at 32 threads (including an if < 1'000'000 inside each thread's lambda),
    and the peak performance for this simple task was single-threaded (I guess because no locking on the atomic was necessary).
    --- everything here is just observations and measurements

  • @nelsonoussahsigha1300
    @nelsonoussahsigha1300 1 year ago +1

    Yes, he could've used workers to create threads for concurrent tasks. By using setTimeout you're still single-threaded, so all those setTimeout calls will be queued inside the callback queue.

  • @smallfox8623
    @smallfox8623 1 year ago +82

    i'm ready for the C# arc let's go, it has a really bad reputation that is totally undeserved these days

    • @reddragon2358
      @reddragon2358 1 year ago +6

      True.

    • @MH_VOID
      @MH_VOID 1 year ago +1

      My personal hate for it came from the pain of trying to use it in my SW dev course on linux compared to those windoze fags who have first class support for everything, and from missing a bunch of the things I love about Rust when doing C# (e.g. immutable by default, f, u, i (though byte is fine and I guess using "long", "short", etc. isn't really bad. more just personal preference and more efficient), match, traits, enums, macros! True some of these stuff are to a decent extent available in C#, but the.. culture doesn't use them primarily like Rust does). But the language itself genuinely looks pretty nice, and has some nice features and shit even over Rust. I'm definitely comfortable calling the language "better Java", and would be okay programming in it professionally or even hobbyistically.

    • @reddragon2358
      @reddragon2358 1 year ago +1

      @@MH_VOID Yeah. Rust is a very intriguing language (excluding the dramas and BS). Also, things should be a lot better than before, although there still is some Windows/Microsoft bias in the language.

    • @sohn7767
      @sohn7767 1 year ago +18

      I think C# is great honestly. Not the best in anything, but it’s good in many areas

    • @reddragon2358
      @reddragon2358 1 year ago

      @@sohn7767 Yeah agree. And I think that it is its main strength. That it can be used for everything.

  • @kooraiber
    @kooraiber 1 year ago +13

    My man hates C# so much, it's hilarious! To be fair though I agree with everything you said and would love to see your benchmarks about this topic.

    • @sanjayidpuganti
      @sanjayidpuganti 1 year ago +15

      @@cethien I love C# but hate MS. I use Rider and Linux to code in my personal time and I like it. I think it's very good for API development.

    • @DaddyFrosty
      @DaddyFrosty 1 year ago +4

      @@cethien VS sucks, Rider rules. I do also hate Microsoft but it’s a good language nonetheless

    • @pavelyeremenko4640
      @pavelyeremenko4640 1 year ago +1

      @@cethien I've been developing c# on linux and macos for a couple of years now using Rider (I just like it more but the Visual Studio is also fully cross platform).
      I don't personally enjoy the language as much nowadays but the tooling is great whatever platform you pick.

    • @DaddyFrosty
      @DaddyFrosty 1 year ago

      @@pavelyeremenko4640 last time I used visual studio on mac it was only for Xamarin

    • @ko-Daegu
      @ko-Daegu 1 year ago

      @@cethien I loooove writing Razor components 🤓
      // MyComponent.razor
      @using Microsoft.AspNetCore.Components
      @Title
      @Message
      @code {
      [Parameter]
      public string Title { get; set; }
      [Parameter]
      public string Message { get; set; }
      }
      the fuck is this shit

  • @paklenizmaj
    @paklenizmaj 1 year ago +1

    I believe that in the java example, the program will "block" on the first unfinished thread, and when that thread finishes and the dispatcher returns execution to the main thread, the for loop will "fly" to the next unfinished thread and then hand over execution to the next thread.
    As the dispatcher flags the thread when it is finished, the join method simply switches (do not block, just switch) the thread if the finished flag is false. So there is no penalty.

    • @RichardKures
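      @RichardKures 1 year ago

      The code in Java could be done much better:
      // numTasks stands in for the loop bound that was lost from this comment
      try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
          for (int i = 0; i < numTasks; i++) {
              // submit(...) is a reconstruction; the original call was also lost
              executor.submit(() -> {
                  try {
                      TimeUnit.SECONDS.sleep(10);
                  } catch (InterruptedException e) {
                      Thread.currentThread().interrupt();
                  }
              });
          }
      }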

    • @paklenizmaj
      @paklenizmaj 1 year ago

      ​@@RichardKures Thread pools are great if you don't need long running tasks, if you need long running sockets or drawing gui in a loop you need to use raw threads. It's not just for Java but for any language. Thread pools create a small number of threads and when a task completes, the new task merges with the previous one, so there is no execution on the new task until the first task completes.
      Thread pools are for (parallel) computation, not for long-running tasks.

  • @sciencefirefly837
    @sciencefirefly837 6 months ago

    Does it also not depend on the type of task which is executed? Usually, it should be some validations and a CRUD in DB.

  • @autismspirit
    @autismspirit 1 year ago +56

    tbh the C# number kind of makes sense, it scales incredibly well, especially in later .NET versions. Some C#-based fancy Unity optimizations can beat out GCC in raw speed and memory.

    • @autismspirit
      @autismspirit 1 year ago +6

      Granted, there is probably some optimization going on in Release mode, since it's not doing anything. I'd expect the memory consumption to be higher, but not 4GB high.

    • @marcossidoruk8033
      @marcossidoruk8033 1 year ago +11

      What do you mean by "beating GCC" last I checked GCC was a compiler.

    • @CorvinhoDoMal
      @CorvinhoDoMal 1 year ago +6

      ​@@marcossidoruk8033 yeah, the optimizations are made by the compiler. He meant the C language, but specifically with GCC. If you used the microsoft compiler or other options you would have different performances.

    • @marcossidoruk8033
      @marcossidoruk8033 1 year ago +16

      ​​​@@CorvinhoDoMal No way C# is going to beat carefully written C code in any imaginable benchmark ever, its just impossible.
      Plus what he said makes no sense, "unity optimizations" how do you compare C# unity performance with C unity performance if you can't do unity scripts in C? Am I going crazy or what.
      And if he means the engine that is written almost in its entirety in C++

    • @janus798
      @janus798 1 year ago +13

      @@marcossidoruk8033 Google the Unity Burst compiler. Faster than GCC in fibonacci and NBody simulation.

  • @R4ngeR4pidz
    @R4ngeR4pidz 1 year ago +28

    You're 100% right about the complexity of the task.
    But also, I would have stopped reading after they said they used ChatGPT to come up with the code.
    You need to have these contributed by people that actually write this language and that actually understand this language.
    The ambiguity between what the code was actually doing in all of these was horrible, as other commenters have also pointed out.

  • @Hector-bj3ls
    @Hector-bj3ls 1 month ago

    In Rust, the default stack size for an OS thread on all tier 1 platforms is 2MB. Not sure if it's allocated up front, but that's probably something to do with when all the memory went.

  • @nyahhbinghi
    @nyahhbinghi 5 months ago +2

    If you are creating a new Elixir "process" per task, it will scale up pretty linearly with the number of tasks, which is why it's high. High memory usage is not really a bad thing, per se. Likewise with Go and goroutines, whereas other runtimes with a fixed thread pool, or Node.js with its single event loop, won't keep climbing linearly. I would be more interested in CPU usage. You're welcome for this insight! 🤜🤛

    • @pdgiddie
      @pdgiddie 4 months ago +1

      This. The BEAM VM was designed to prioritise latency and predictable scalability. Copy-on-write and other memory consumption optimisations can produce latency spikes.

  • @casperes0912
    @casperes0912 1 year ago +12

    I will most likely need to use C# as my primary language at my next job

  • @_daniel.w
    @_daniel.w 1 year ago +5

    I'm curious about C, C++ & Zig.
    Also, I love Go. What happened, why did it end up using so much memory? Kinda sucks

    • @_daniel.w
      @_daniel.w 1 year ago

      @nósferratu Oh, alright.
      I was watching chat go by and someone mentioned Go is stackbased or something along those lines.
      Thanks for the info 👍

    • @hvaghani
      @hvaghani 1 year ago

      ​@nósferratu right I was going to comment the same and found this

    • @scotter7663
      @scotter7663 1 year ago

      The C# implementation is completely bogus compared to the others. It's using a small thread pool (Task.Run) to set a bunch of timers (Task.Delay); that's why it shows low memory usage. This is not demonstrating concurrency.
      If the implementation did a Thread.Sleep or used real threads, the results would be completely different and probably worse than Java, since C# doesn't have virtual threads.
      In the real world, Go runtimes will have considerably less memory overhead than C# or Java.

    • @scotter7663
      @scotter7663 1 year ago

      ​@@_daniel.w Go has a delay() function that looks similar to what's used in the C# impl. Rework the Go implementation to use this and I suspect it will perform drastically better

  • @indramal
    @indramal 11 months ago

    So what is the final choice for high traffic? Does it only need memory consideration? The number of concurrent connections also matters.

  • @edino1981
    @edino1981 7 months ago +1

    It seems to me that C# sample is not tested in the release mode but in debug mode, so memory consumption should be smaller as tasks are just light definitions of work that are executed on thread pool.

  • @3x10.8_ms
    @3x10.8_ms 1 year ago +25

    crab is fast and fox is slow

  • @insylogo
    @insylogo 1 year ago +3

    AOT and tree shaking business has come a long way with c#. I would assume actual minimums an order of magnitude or less, but he did say default release configurations.

  • @zolniu
    @zolniu 1 month ago

    In C# when you use Tasks with async/await, the default implementation creates a state machine that uses pre-existing thread pool to schedule execution of your tasks on the threads in the thread pool. Not only that, but it can even detect if the task in the thread is small enough to be executed synchronously - in that case it won't even end up in the thread pool - it will just execute and return as normal function call.
    To test how much memory threads consume in C#, you can't use Tasks with async/await. You have to use the Thread class directly; that way you circumvent all of the optimizations done in the runtime and in the Task scheduler.

  • @awilliamwest
    @awilliamwest 9 months ago +2

    I'm sad for F#. Interesting to see Primeagen and others re-excited about OCaml, and perhaps the 5.0 release is one reason, but I was an F# fanatic for several years, and just returned to F# for a recent small project. (I *try* to choose Rust for new projects, but got frustrated with Rust's lack of a REPL and wanted to use Ionide in VS Code for my small project (involving parsing XLS and zips of text files); sometimes it's more about the tooling/IDE than it is the language...) C#'s good performance here makes me think F# might perform equally well; .NET has continued to make impressive optimizations.

  • @vighnesh153
    @vighnesh153 1 year ago +5

    More interested in seeing Nodejs 20 with worker threads as they claim that there is a lot of perf improvements in Node 20

  • @boredstudent9468
    @boredstudent9468 1 year ago +8

    He said he launched 1 Task. As soon as you start one async task, C# (in .NET 6) already sets up all the thread pool stuff and access control. For such simple instances you should use threads in C#. Afaik it greatly improved with .NET 7. But in exchange you are prepared to scale incredibly. Also, yeah, the .NET runtime does some incredibly smart magic in the background, e.g. have a look at LINQ performance in .NET 7.

    • @metaltyphoon
      @metaltyphoon 1 year ago

      CAS is not a thing anymore in dotnet core world.

    • @sgbench
      @sgbench 1 year ago +1

      @@metaltyphoon CAS?

    • @rroscop
      @rroscop 1 year ago

      Can you really run 1 million C# threads?

    • @boredstudent9468
      @boredstudent9468 1 year ago +1

      @@rroscop on my hardware no problemo, remember that they are way more like go routines than like hardware threads, so only a dozen is actually working in parallel, the rest is just queued.

    • @rroscop
      @rroscop 1 year ago

      @@boredstudent9468 nice. Are you talking about System.Threading.Thread's? Or tasks run via Task.Run()?
      my understanding was that Task.Run() used a thread pool under the hood, but real Threads were more heavyweight. I'm not a C# developer though, just dabbled

  • @geraldmaale
    @geraldmaale 11 months ago

    I am interested in finding out what tool this person used to measure the memory usage for the C# part, as these results appear to be questionable.

  • @bentels5340
    @bentels5340 1 year ago

    Quick correction regarding the Java remark: virtual threads are not a preview in 21, they are done. What *is* a preview is structured concurrency, which handles thread-spawn and rejoin more elegantly.

  • @remrevo3944
    @remrevo3944 1 year ago +9

    12:30 Per default tokio creates worker threads equal to the amount of cpu cores.
    Though thinking about it, if you only use timers having a single threaded runtime would likely be just as fast and more efficient.

    • @llothar68
      @llothar68 1 year ago +1

      Not a good choice. You often have long-running threads that also block. In fact, all the systems where the kernel is not controlling the worker threads suck. This means: Linux, Android and the BSDs. The other systems have kernel-driven thread pools for much better handling, making sure that IO blocks don't prevent utilisation.

    • @remrevo3944
      @remrevo3944 1 year ago +1

      @llothar68 I explicitly meant that for the case of using only timers, which are neither CPU-intensive nor use blocking APIs.
      When using an async runtime like tokio you shouldn't use blocking APIs anyway, and if you have to, there is tokio::task::spawn_blocking, which spawns a thread/uses a thread pool.

  • @c4ashley
    @c4ashley 1 year ago +6

    The name is the C-sharpagen.

  • @wdavid3116
    @wdavid3116 1 year ago +1

    I don't think the thread joins are actually an issue. All that is being measured is memory. The time cost would be real but if you actually have to wait on all those threads the order shouldn't be very meaningful and to get any sort of speedup you'd need an os that supports joining multiple threads at once or you'd have to do something more elaborate to make use of some sort of multiple message capability in the kernel (maybe something with epoll?) If you're waiting on thread 0 and thread 1 quits you'll be sleeping in thread 0 while other threads use the CPU to finish and then once the thread you're joining on ends you'll burn through the finished threads and then repeat the sleep as needed. Syscalls are expensive but not *that* expensive.

  • @robertwhite3503
    @robertwhite3503 1 year ago

    I was taught at primary school that infinity was the same as infinity plus one. However at college was taught about "marking". If you match all the numbers from one to infinity, then plus one is longer. So at eight years old I was being mis-taught maths?

  • @reddragon2358
    @reddragon2358 1 year ago +9

    Let's go C#

  • @erickmoya1401
    @erickmoya1401 1 year ago +4

    My wife says you yell too much. I tried to prove she is wrong.
    My argument didn't last a second.

  • @basimal-jawahery5688
    @basimal-jawahery5688 1 month ago

    Awesome!! :)) extremely funny :) thanks for the video :)

  • @Zooiest
    @Zooiest 1 year ago

    Well, technically, JS structs can take up as few bytes as any other language, as long as you ignore the sizes of serialization/deserialization definitions and only care about the size of the ArrayBuffer you put data in

  • @quachhengtony7651
    @quachhengtony7651 1 year ago +5

    Let's rewrite Elasticsearch, Kafka, and Cassandra in C# and get free performance

  • @SharunKumar
    @SharunKumar 1 year ago +7

    I wanna see Nick Chapsas's reaction on this 🤣

  • @RoccoWocco
    @RoccoWocco 1 year ago +1

    C# has a parallel for and foreach for these types of scenarios. You can tell it the degree of parallelism and it'll just do it for you. In no scenario is the way shown in the article correct. That's an anti pattern in 99% of cases. If you do want to do async in your parallel code then there are async versions of the parallel loops.
    You could also just manually make threads

  • @uncrunch398
    @uncrunch398 20 days ago

    Are there real world scenarios that need that many threads other than io for real time databases which should be run on entire hardware stacks specifically designed for databases?

  • @shayvt
    @shayvt 1 year ago +4

    C# Task is an abstraction using the threadpool. He should use the Thread class which instantiates a real thread.

    • @DarkOoze123
      @DarkOoze123 1 year ago

      *managed thread

    • @LuaanTi
      @LuaanTi 1 year ago +2

      No, C# Task implies no threads whatsoever. It uses the thread pool by default for CPU work, yes, but that can easily be just the part of the job that says "this task is finished" (e.g. handling the async I/O response).
      Creating an explicit thread (_not_ a hardware thread, _not_ an OS thread - you don't have control over those natively in .NET) is something completely different, and very rarely used in modern C#. It negates the whole point of using asynchronous I/O in the first place, which is avoiding the overhead of threads that do nothing but wait for something to complete (whether that's a timer or a HTTP request). Which, let's not forget, was part of the point of the original article - showing how expensive "real" threads are, and that different approaches to handling asynchronous code have vastly different results.
      But that article is very flawed anyway. It would make sense to compare multi-threaded code with other ways of doing asynchronous I/O... but instead, we get an arbitrary choice of one or the other for each platform. You can have promises in any language. Many have commonly used or outright built-in APIs for that. Seeing the difference between, say, Java threads and Java Futures would be a bit illuminating, at least... though it still needs to be noted that you have a lot of control over things that absolutely crush this comparison anyway. The default stack size of a new thread on modern .NET is usually 1 MiB. Windows doesn't really allow you to go very small with thread stack sizes (you're supposed to use a few threads, not thousands). Linux is designed around multiple processes/threads using the same memory for as long as possible, so a thousand threads each with 1 MiB memory can actually occupy just a few megabytes (until you actually start to modify the memory).
      Every performance benchmarks needs to have a goal. This one doesn't really seem to have one, apart from a simplistic "weird that memory usage in async stuff can vary wildly"... I mean, pretty much every platform out there allows you to pre-allocate as much unused memory as you want, but it'd be a weird way to compare different platforms, right?

  • @joejazdzewski
    @joejazdzewski 1 year ago +4

    Prime will now worship at the altar of Anders (creator of C# and Typescript) /s

  • @TertiumAverruncus
    @TertiumAverruncus 1 month ago

    Virtual threads are backed by a fork-join pool and basically use asynchronous operations with event loops, but at the VM level. Sigh, not even close to using that right, since it's dependent on the default fork-join pool, which depends on your system's number of CPUs etc.

  • @thekwoka4707
    @thekwoka4707 1 year ago +2

    Why were they using the newest Rust from last month and Node.js from like 4 years ago? Like, AWS doesn't support the version they used. Or the 3 major versions after it.

  • @kellybmackenzie
    @kellybmackenzie 1 year ago +4

    I would have loved to see Haskell tested like this, it'd be so good

    • @FinnBender
      @FinnBender 1 year ago +3

      It's surprisingly bad :(
      1 thread: 5.0 MB
      10 threads: 4.9 MB
      100 threads: 4.9 MB
      1k threads: 8.3 MB
      10k threads: 63.1 MB
      100k threads: 803.8 MB

    • @kellybmackenzie
      @kellybmackenzie 1 year ago

      @@FinnBender Aww man! Yeah, that makes sense, Haskell is infamous for its high memory consumption because of thunks and stuff like that. I'm surprised it's that bad for 100k though, damnnn!

  • @maxharmony6994
    @maxharmony6994 1 year ago +5

    Now imagine giving Tom a C#

  • @TheSwissGabber
    @TheSwissGabber 8 months ago

    In Python there are asyncio, threading and multiprocessing, ordered by increasing overhead. If you want to use multiple cores you need multiprocessing.
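
A minimal sketch of the first two options, assuming CPython: it records which OS thread each unit of work runs on, to show that asyncio tasks all share one thread while `threading.Thread` objects each get their own. The helper names (`sleeper`, `run_tasks`, `run_threads`) are made up for illustration, and this is not the article's benchmark code.

```python
import asyncio
import threading


async def sleeper(seen: set) -> None:
    # Record the thread running this coroutine, then yield to the event loop.
    seen.add(threading.get_ident())
    await asyncio.sleep(0.01)


async def run_tasks(n: int) -> set:
    # Schedule n coroutines on the single event loop and wait for all of them.
    seen: set = set()
    await asyncio.gather(*(sleeper(seen) for _ in range(n)))
    return seen


def run_threads(n: int) -> set:
    seen: set = set()
    barrier = threading.Barrier(n)

    def worker() -> None:
        seen.add(threading.get_ident())
        # Keep every thread alive until all have registered,
        # so thread identifiers cannot be reused.
        barrier.wait()

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen


task_thread_ids = asyncio.run(run_tasks(1000))
thread_ids = run_threads(8)
print(len(task_thread_ids), "thread for 1000 asyncio tasks")
print(len(thread_ids), "threads for 8 threading.Thread objects")
```

So a memory figure for a million asyncio tasks is measuring coroutine objects on one thread, not a million threads.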

  • @PeterBernardin
    @PeterBernardin 1 year ago

    Also, is tokio actually multi-threading or is it just made for non-blocking IO in one thread?

    • @thekwoka4707
      @thekwoka4707 1 year ago +1

      Both.
      It makes multiple threads, and moves your processes between threads. That's why everything needs to implement Send, so it can be moved between threads safely.

  • @tedchirvasiu
    @tedchirvasiu 1 year ago +4

    Is this the first time in history he turned off the notifications before starting the video?

  • @ringishpil
    @ringishpil 1 year ago +24

    Go's minimum stack size is (I think) 4KB per goroutine, and it grows/shrinks as needed. Not sure what the minimum stack size is. Therefore the ~2GB in Go is not surprising. So in 3GB of memory you can put 1 million, 10 million, and probably even 20 or 30 million goroutines; they will just shrink in size. With the example from Piotr you can probably do even more, since these are very simple routines that consume little memory. But as I said, I'm not sure what minimum stack size a goroutine will consume. It's less than 4KB for sure (in your example 2.8GB/1_000_000 = 2.8KB). My guess is that it's not shrinking below this since there is enough memory available.
    Anyway, you put it nicely: this is not a real-world test. A TCP/WebSocket connection test would be much better.
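
The per-goroutine figure above is plain division, sketched here in Python. The 2.8 GB total is the number quoted in this comment, taken as decimal gigabytes; the 3 GB budget and the variable names are only illustrative assumptions.

```python
# Total memory quoted in the comment for 1,000,000 goroutines (decimal GB assumed).
total_bytes = 2_800_000_000
goroutines = 1_000_000

per_goroutine = total_bytes / goroutines
print(f"{per_goroutine / 1000:.1f} KB per goroutine")

# How many goroutines fit in a 3 GB budget if each one keeps costing that much.
budget_bytes = 3_000_000_000
fit = budget_bytes // int(per_goroutine)
print(f"{fit:,} goroutines fit in 3 GB at that cost")
```

At an unchanged per-goroutine cost, 3 GB holds only slightly more than a million goroutines, so the larger counts mentioned above would need each goroutine to cost much less.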

    • @Rakstawr
      @Rakstawr 8 months ago +1

      Go test here was completely misrepresented by non-optimized garbage collection settings and not profiling how much of that was colored for deletion.

  • @Gennys
    @Gennys 4 months ago +1

    It looks as though C# is creating a thread pool by default instead of actually launching threads.

  • @memespdf
    @memespdf 11 months ago +1

    Ironically, I think it would make sense to start all programs by allocating a static 1GB of memory and keeping it around at the end. This ensures that no preallocated memory can be used

  • @istovall2624
    @istovall2624 1 year ago +4

    C# to the moon! Haven't finished yet. Drum roll.

  • @urbanelemental3308
    @urbanelemental3308 1 year ago +5

    Yeah, the C# example is not real threads. The code is just adding tasks to the scheduler, similar to "setTimeout" in JS. Which might be fine for most things, but each "Task" is taking up memory and then waiting to run. IMO, these tests are not good overall. I agree the Java one is probably not a good example either, with the synchronous join.

    • @metaltyphoon
      @metaltyphoon 1 year ago +5

      Dude… only one of the Java examples and one of the Rust examples used real threads. All the others use a pool abstraction. I think Elixir uses actual processes.

    • @zephyrprime
      @zephyrprime 25 days ago

      Not full threads but not just tasks either. Tasks use a thread pool to manage execution, and the .NET runtime will decide how many threads are in that thread pool.

  • @ingenium1502
    @ingenium1502 1 year ago

    Yes we would like to know about socket and tcp connection test. Thx for video😀

  • @sikor02
    @sikor02 9 months ago +1

    If C# has memory available it will swallow a lot for optimizations. Once I experimented with Docker and performance-tested my simple API endpoint with Bombardier (a tool written in Go), bombarding it with thousands of requests. My app used 1.5 GB of RAM (!). But then I started limiting my container's available memory (-m parameter), and guess what, I went down to 15 MB and it still worked. The Go equivalent required at least 16 MB to work. The C# API with so little memory available performed almost the same as when using 1.5 GB anyway. (Go was like 2% faster though, not gonna lie)

  • @tecoberg
    @tecoberg 1 month ago +3

    Where is C++?

  • @alxizr
    @alxizr 1 year ago +3

    The Node.js example is off point. You need to choose worker threads to stay in line with all of the other examples.
    The same goes for the Python asyncio example.

  • @HotakaPeter
    @HotakaPeter 7 months ago

    Elixir/Erlang have a lot of services running by default. These can be optimised in the Erlang boot script.

  • @siddharthabajpai4182
    @siddharthabajpai4182 9 days ago

    Found this somewhere: "Asyncio provides a single-threaded, non-blocking concurrency model in Python." Would it be correct to use it in this benchmark?