Double the Performance of your Dictionary in C#

Поделиться
HTML-код
  • Опубликовано: 26 сен 2024
  • Enroll to "Cloud Fundamentals: AWS Services for C# Developers" for FREE: bit.ly/3XKUBOH
    Become a Patreon and get source code access: / nickchapsas
    Hello everybody I'm Nick and in this video I will show you a pretty advanced technique that you can use to double the performance of your Dictionary in C# and .NET.
    Don't forget to comment, like and subscribe :)
    Social Media:
    Follow me on GitHub: bit.ly/ChapsasG...
    Follow me on Twitter: bit.ly/ChapsasT...
    Connect on LinkedIn: bit.ly/ChapsasL...
    Keep coding merch: keepcoding.shop
    #csharp #dotnet

Комментарии • 121

  • @ja_mcito
    @ja_mcito Год назад +100

    great, now it needs to be simplified with syntactic sugar

    • @volan4ik.
      @volan4ik. Год назад +26

      No, efficient code that uses anything related to "unsafe" or "marshal" must be complicated and mind-blowing to show off in front of C# newcomers how good you are. This is what we never have to change🙂

    • @orterves
      @orterves Год назад +9

      Yep, the very first thing I'd do is create extension methods.

    • @Bliss467
      @Bliss467 Год назад +7

      public static V GetOrPut(this Dictionary source, K key, Action valueGenerator) where K : notnull {}

    • @saniel2748
      @saniel2748 Год назад +1

      @@volan4ik. literally none of that is complicated but ok

    • @volan4ik.
      @volan4ik. Год назад

      @@saniel2748 literally it's not, but might look like it is to people not familiar with that mechanism

  • @haxi52
    @haxi52 Год назад +55

    Given readability and complexity I would only consider using the method in hot paths that really needed it. But good info to know.

  • @maurosampietro9900
    @maurosampietro9900 Год назад +16

    I value maintainability over everything else an probably won’t be using this, except for very demanding and performance sensitive loops

  • @welltypedwitch
    @welltypedwitch Год назад +5

    Haskell has the most elegant solution to this that I'm aware of. Haskell uses the function alter with a type like this (translated to C# syntax and mutable dictionaries)
    void alter(Dictionary dict, K key, Func f)
    Now, the argument that is passed to `f` is `None` iff the key was not found and `Some(theValueAtTheKey)` otherwise, and returning `None` from f will remove the binding from the map.
    There is also alterF that allows f to return in an arbitrary functor, but that doesn't translate to C# all that well.

  • @FranciscoDias-ne9nm
    @FranciscoDias-ne9nm Год назад +27

    I'm almost sure that last time I checked there was a "TryAdd" function on Dictionary, could it be used too?

    • @Sindrijo
      @Sindrijo Год назад +1

      Yes.

    • @brianviktor8212
      @brianviktor8212 Год назад +3

      True. Thinking about it, internally it could be optimized to call the dictionary only once. The only difference is that with a ref you can handle structs as values more efficiently, but to be fair, who uses structs as values in dictionaries anyway? It's rare, and it's especially rare to also make it more performant because it could be a performance bottleneck.

    • @Eternal--Sunshine
      @Eternal--Sunshine Год назад +17

      In fact, the TryAdd method, which directly calls the private TryInsert method, has the comment "NOTE: this method is mirrored in CollectionsMarshal.GetValueRefOrAddDefault below", so in this case TryAdd makes more sense for clarity

    • @alexisfibonacci
      @alexisfibonacci Год назад +7

      @@Eternal--Sunshine TryAdd takes a precomputed value. What if that value is expensive and you want to compute it only if you need to insert it into the dictionary? In which case a factory method/delegate parameter will come in handy. But that represents potential overhead for dictionary access? Just thinking...

    • @antonofka9018
      @antonofka9018 Год назад

      @@brianviktor8212 nobody does precisely because of the problems listed in the video. people have to go as far as creating custom crappy dictionary types because of this.

  • @tosunabi1664
    @tosunabi1664 Год назад +11

    Would've been nice if you've added benchmark for TryAdd method.

  • @matc83
    @matc83 Год назад +2

    Thanks! This helped me shave about 15% off the runtime of the pathfinding code for the game I'm working on!

  • @alexandernava9275
    @alexandernava9275 Год назад +4

    Multiple things can have the same hash, just as multiple passwords can have the same hash(part of brute forcing is you don't need to find the exact password, just one with the same hash). Just the number of collisions is low(though does go up quickly, like the birthday problem).

  • @rick2591
    @rick2591 Год назад +5

    I would like to see a performance comparison where the dictionaries are operating in a multi threaded environment and we need to make sure that this code executes in a critical section.

  • @Andrew90046zero
    @Andrew90046zero Год назад +10

    So I use Unity Engine, and I don't think we have access to these functions unfortunately, because Unity is stuck on an older .NET version. But do you think there are some internal functions that exist in the older .NET versions that I can do some reflection on to get access to?
    Btw, I really enjoy watching these videos where you show hidden gems of devious things you can do in C# It gets me excited!

    • @christoph6055
      @christoph6055 Год назад +1

      You'd have to check out if Mono has some, since those might be implementation details and aren't standardized.
      Unity might switch to .NET 6 soon ish, though I sadly don't know when

  • @rreiter
    @rreiter Год назад +2

    ref var x = ref blahblahblah f(var out y, struct butOnlyOnATuesday). LOL, now take that and syntactic-sugar-ize it. Been coding for 40+ years and I can't help but feel that over time all the things that made C# nice and readable are being obfuscated and re-obfuscated and soon it may read again like (my beloved but almost incomprehensible) C ! Having said that, thanks for the video, good one!

  • @revilorere
    @revilorere Год назад

    I really appreciate all of your videos!

  • @Dustyy01
    @Dustyy01 Год назад +19

    Really nice, but curious what would happen if the dictionary changes during the lookup and setting? Undefined behavior?

    • @antonofka9018
      @antonofka9018 Год назад +1

      potentially a seg fault ig

    • @Mortizul
      @Mortizul Год назад +2

      It's a collection changed exception.

    • @MrHacksTutos
      @MrHacksTutos Год назад

      We can't use lock to this scenario?

  • @x3mbonus
    @x3mbonus Год назад +1

    Why do we need use ContainsKey before assigning or using Dictionary value?
    We can assign directly
    var x = new Dictionary();
    x[123] = "abc";
    x[123] = "def";
    It will add or replace value

    • @nickchapsas
      @nickchapsas  Год назад +2

      Because I don’t want to add the value if it already exists. It is a valid usecase in many scenarios. Consider that a player in an online game needs to join the queue to find a match to play. If they don’t exist then we add them in the dictionary. If they do exist we show the "you are already in the queue" message

    • @keksoid4
      @keksoid4 Год назад

      @@nickchapsas Yep, but in your case it's not performance matter already. In case of reference type object to be added into dictionary it depends on:
      1) If object exists(created) when checking is performed
      2) Actual object supposed to be stored in dictionary should always be the same(hashcode evaluated on object fields), so there is no reason to be afraid of replacing the same instance(only if it should be created before add operation(see p1))
      3)If you only need to inform caller that object already exists, it's better to use two methods: one for checking, one for adding/replacing and use for apropriate for your scenario
      It's kind a tricky question you've raised and in case of reference types I would not recomend using those approach. But in case of structs where unobvious copy operations exist, it's possible, but only if properly tested and measured

    • @kousikpandey3401
      @kousikpandey3401 4 дня назад

      @@keksoid4 I think these methods would be of help when considered for multi-level dictionaries.

  • @BillyBraga
    @BillyBraga Год назад +4

    7:45 In between the time GetValueRefOrAddDefault is called and the time valOrNew is updated, does it mean some other piece of code could get the entry from the dictionary and get a null value?

    • @cn-ml
      @cn-ml Год назад +2

      Yes, i imagine so. The dictionary is not locked in any way, and the marshal call only inserts a default. Would be "fixed" with a factory method instead of default

  • @metaltyphoon
    @metaltyphoon Год назад +3

    There is a TryAdd method on dictionary now. Idk if this is necessary .

  • @arztje
    @arztje Год назад +4

    It's a cool idea, but I can't condone using unsafe code. I am sure there are good use cases for it - but you need to weigh the risk before considering a path like this.

    • @kylekeenan3485
      @kylekeenan3485 Год назад

      You are right, but Nick is here to push our boundaries.

  • @cn-ml
    @cn-ml Год назад +2

    Why do they not handle it like the concurrent dict with a getoradd function or maybe with a factory method instead of default

    • @antonofka9018
      @antonofka9018 Год назад +1

      because it would need to allocate the delegate

    • @cn-ml
      @cn-ml Год назад

      @@antonofka9018 right, i forgot about that.

  • @Variapolis
    @Variapolis Год назад +1

    9 minutes ago. What a great start to my afternoon.

  • @Andrew90046zero
    @Andrew90046zero Год назад

    "Dictionary" and "performance" in the title??? SOLD

  • @kvloover
    @kvloover Год назад

    was close to trying this out today already, but... can't use refs in async code
    (well I guess it'd be running into the formentioned issues when updating the dictionary potentially as well)
    but still sad :D

  • @krzysztofhandzlik9273
    @krzysztofhandzlik9273 Год назад +2

    It's awesome :) , but i hope i will never find such tricky and magical code in my daily work. Even chatgpt won't help me understand what the heck is going on here XD.

  • @ocnah
    @ocnah Год назад +1

    So how would one guarantee, for instance in an web api, that the dictionary value won't change during the lookup, without lowering performance?

    • @davinki3728
      @davinki3728 Год назад

      IReadOnlyDictionary can guarantee no modifications, or you can create your own wrapper that locks all operations on a dictionary so only one thing can be done to it at a time. ConcurrentDictionary also has a GetOrAdd method that does what is shown in the video

    • @dextergandini
      @dextergandini Год назад

      Locks, single thread, changes only during initialization, etc. Either way it doesn't look pretty

    • @ocnah
      @ocnah Год назад

      Aren't locks affecting performance, potentially cancelling the gain that was created by this way of setting the value?

  • @astralpowers
    @astralpowers Год назад +2

    Why not add this as an extension method similar to the ConcurrentDictionary.GetOrAdd(TKey key, Func valueFactory) ?

    • @protox4
      @protox4 Год назад

      Because that delegate slows it down. I could see that being useful if C# eventually adds value delegates, though.

    • @astralpowers
      @astralpowers Год назад

      @@protox4 the best solution would be if the JIT can inline the delegate

  • @ckhinsight
    @ckhinsight Год назад

    I wanted to know what happened to you after the ads? Why you changed your hoodie?

  • @ItsTheMojo
    @ItsTheMojo Год назад

    The struct example seems more a warning about misusing structs than anything else.

  • @shahab-the-guy
    @shahab-the-guy Год назад +1

    There is an issue in the sample for the struct scenarios with updating the ID of the value inside the dictionary the Key referring to that is not really updated, the Key is still referring to the old value while the id of the instance of the struct is updated. could result in bugs at some scenarios ruclips.net/video/rygIP5sh00M/видео.html

    • @Erlisch1337
      @Erlisch1337 Год назад +1

      11:33
      simplified your link :)
      (YT makes time comments into clickable timestamps)

  • @eriknyk2k
    @eriknyk2k Год назад

    Hi Nick, I follow and liked many of your videos, dumb question do you will ever considered make videos about Golang or Java? :p

    • @nickchapsas
      @nickchapsas  Год назад +12

      Stay tuned for the first of April. There will be a Java video going up 😂

    • @FiggyCricket
      @FiggyCricket Год назад +1

      🤣

    • @eriknyk2k
      @eriknyk2k Год назад

      @@nickchapsas awesome!

  • @watherby29
    @watherby29 Год назад +1

    Hashmap! Haaaashmappp! => hired

  • @Animal-yb1rr
    @Animal-yb1rr Год назад +10

    I am making big animal type dictionary with different species and their attributes

  • @urbanguest
    @urbanguest Год назад +1

    Man, where were you before I retired? Your videos are the best and excellent!!!

  • @aronsz
    @aronsz Год назад

    How are you able to see into the Dictionary's set definition? (3:29) Is this something specific to this IDE or can it be done in Visual Studio also?

    • @romka2009
      @romka2009 Год назад

      This is one of features of ReSharper or Rider IDE (in his case) :)

    • @keyser456
      @keyser456 11 месяцев назад

      Pretty sure this has worked in Visual Studio as well since ... forever. There's a "Locals" window while debugging, or the inline code has context sensitive drop-downs (like you're seeing in this video), or you can probe stuff live with code including Intellisense from the "Immediate" window. I just switched to Rider about a month ago and it's very impressive, but a lot of this stuff can also be found in VS (including Community version).

    • @idk-jb7lx
      @idk-jb7lx 4 месяца назад

      @@keyser456 to be fair it doesnt always work well in VS, which is probably what the OP is referring to. for example many times for the stdlib it just shows you the method's "signatures" and not the body for some reason.

  • @thenoober3321
    @thenoober3321 Год назад

    Could we lock the dictionary and run this code then unlock it?

  • @true_xander
    @true_xander Год назад

    And what about the following? (dict[key]

  • @weluvmusicz
    @weluvmusicz Год назад +1

    Does it work in an async context?

    • @alexisfibonacci
      @alexisfibonacci Год назад

      Yes, as long as no other code modifies the dictionary before you complete whatever operation you are trying to carry out.

    • @idk-jb7lx
      @idk-jb7lx 4 месяца назад

      @@alexisfibonacci ref locals are forbidden in async methods because they're actually fields, no?

  • @BadgersEscape
    @BadgersEscape Год назад +2

    The "GetOrNullRef" example you made was really awkward.... in the end you have an item that is saved with an incorrect key.
    Just think you should have mentioned that one should not do that specifically.

  • @codingbloke
    @codingbloke Год назад +11

    At time index 7:22 a good example of why (IMO) `var` is undesirable. This is code I've never seen before and my comprehension is compromised a little by the lack of type information in the code. Especially in a team of developers where code gets included in the product via pull requests after review, the use of `var` hampers the review process even more because code is reviewed by Web based before/after UIs that don't let you hover over the `var` to tell you what the type is (and even having to do that is worse than simply using `bool` in the place of `var` for example).

    • @brianviktor8212
      @brianviktor8212 Год назад +4

      True. I personally almost never use var. The only way to make it workable in Visual Studio is to activate some option which shows the actual type of vars right next to it. But as you said, outside of VS the problem is still there and makes life harder.

    • @ItsTheMojo
      @ItsTheMojo Год назад +2

      var was only intended for cases where the type is obvious. Unfortunately, most developers I've worked with decided to use it everywhere. Ironically, those same developers often complain about the difficulties they encounter when working with variables in languages like JavaScript or python because they're dynamically typed and so variable declarations don't give type information.

    • @anderskehlet4196
      @anderskehlet4196 Год назад

      @@ItsTheMojo - In JS you're forced to always keep track of the types of everything, because otherwise you'll end up with some bad bugs. In C# the compiler does the work for you. Having the machine deal with it lessens the cognitive load of the developer and is much less error prone. There's a reason languages like TS, Python3 and PHP have moved towards more static analysis.
      Personal opinion: Types are for the compiler to ensure correctness. If they can be inferred, all the better.
      See also every FP language.

    • @nikolaslijepcevic
      @nikolaslijepcevic Год назад

      @@brianviktor8212 Code is much cleaner if you use simply use var keyword. Especially if you class has mile long names and it has Generics in Generics classes or collections. With ctrl + left mouse or mouse over you can easily see what type represents the return method or property.

    • @ItsTheMojo
      @ItsTheMojo Год назад

      @@anderskehlet4196 and using var when the type is not obvious forces the reader to keep track of types too.

  • @hugodufort
    @hugodufort Год назад

    I did not get why you are using the "ref" keyword?

    • @billy65bob
      @billy65bob Год назад +3

      The CollectionMarshall methods returns the store itself, so you use the ref keyword to interact directly with it.
      The specific method Nick uses puts a properly keyed and very unsafe null into the dictionary, the ref keyword allows you to replace this null with the actual value.
      You can also do some serious damage with it; it's borderline unsafe and fast for a reason.

    • @francoiscoupal7057
      @francoiscoupal7057 Год назад +2

      Short version: this "forces" the compiler to use the variable as a pure memory pointer, even if it MAY point to a value/struct type, or even a "null" value of said type.
      It's a memory address, not a variable reference. This code behaves like C++ pointers, where you can store in a variable either the value (val) or the reference/address (ref) to that value.
      Basically:
      int a = 5; // Declares memory space and assign value.
      int *b = &a; // Stores the pointer value to another variable (& fetches the memory location of the variable, * indicates a pointer variable.)
      *b = 6; // Stores a NEW value directly to the memory location, by dereferencing the address b points to)
      printf(*b); // Would print the value AT position b by dereferencing, hence would show "6"
      printf(b); // Would print the integer value of the memory position, which would most likely NOT be "6".
      Using this method in C# (reference pointers) you can manipulate the value of BOTH reference types (classes) or value types (structs and values) via pointers. Hence, this is *unsafe* because you override the usual C# logic for referencing values, and directly tamper with memory locations: something the language was made to prevent for usability in most contexts.

    • @hugodufort
      @hugodufort Год назад

      @@francoiscoupal7057 Merci Francois 🙂

  • @johncassidy3071
    @johncassidy3071 4 месяца назад

    So what you're saying is that c# has finally caught up to Perl. With much uglier syntax.

  • @MrVarsium
    @MrVarsium Год назад

    time to test tryadd() :p

  • @GameDevNerd
    @GameDevNerd Год назад +1

    *Ad* ::
    "Do you want to use C# in the cloud and have muscles and hot babes like Nick Chapsas?!"
    *Me* :: _panic buys all courses_

  • @swordblaster2596
    @swordblaster2596 Год назад

    interesting again, but I'm never going to use it or allow it to be used!

  • @scott98390
    @scott98390 Год назад

    WHY WHY WHY does your website not use Google as an authentication provider? Teachable? Really?

  • @evgenxpp
    @evgenxpp Год назад

    I don't like such syntax sugar personally because there is no significant benefit. Just another way to make common things...

  • @urzaaaaa
    @urzaaaaa Год назад

    Interesting, but no thanks, I'll pass and take the slower code that will not blow in my (or my coworkers) face.

  • @bobfish7699
    @bobfish7699 Год назад

    The syntax is awful. Really should be added directly into..

    • @FrankShearar
      @FrankShearar Год назад +2

      Given what this does, and how it has massive footgun potential, that awful syntax might be deliberate. (See Scala's choice of "isInstanceOf".) If you really need it, you can put up with the ugliness. And if you don't... the ugliness reminds you that you likely shouldn't be using this technique :)

  • @mikkolukas
    @mikkolukas Год назад

    2X FASTER is not the same as double performance. It it the same as triple performance.
    2X AS FAST is the same as double performance.

  • @jeroen7362
    @jeroen7362 Год назад

    Again the lowlevel stuff, even using !Unsafe i am so glad i do not even know what it does. Read the word UNSAFE that you should not use if you want to create readable and maintable code. if you are really worried about the penalty of containskey maybe assembler is your better language of choice to work with.

  • @maciekwar
    @maciekwar Год назад +1

    Its a click bait :-) my honest opinion :-)

    • @nickchapsas
      @nickchapsas  Год назад

      It's not :-) numbers don't lie :-)

    • @maciekwar
      @maciekwar Год назад +1

      @@nickchapsas it is:) its one edge case that 95% will not experience and even if they do, I'm sure they have more serious problems in their code :) dont get me wrong its useful information but the thumbnail is a click bait :)

    • @nickchapsas
      @nickchapsas  Год назад

      @@maciekwar It's not :) Just because the usecase is limited it doesn't make the video clickbait. I am showing you a way to double your performance. Whether you can use that in your very specific usecase or not is irrelevant. :)

    • @maciekwar
      @maciekwar Год назад +1

      @@nickchapsas you have 4 scenarios in 75% of your scenarios difference is NOT 2x faster :)

  • @MaxiTB
    @MaxiTB Год назад

    Yeah, using struct which are bigger than 32 Bytes is really, really bad. You should never do that 🙂

  • @anthonysteinerv
    @anthonysteinerv Год назад +3

    While I appreciate this type of videos, but tbh it's been a long time since I think these are useful, maybe you can stop with the obsession of posting performance things for 1 ms change, or no 100kb of memory used, please do something else, I know your actual useful videos are on a paywall in a course type of content, but maybe posting an introduction to those would actuall help people start, and maybe generate more sells for those videos.

    • @nickchapsas
      @nickchapsas  Год назад +1

      I’m not looking to sell more courses. I’m looking to showcase things that I find cool to a wider audience. If I wanted to sell more courses I would make freemium videos that show 5% of a topic and then tell you that to get the rest you have to buy a course

  • @phantompooper
    @phantompooper Год назад +1

    The madman uploading videos in the middle of NDC

  • @DanGolick
    @DanGolick Год назад

    Unsafe indeed. You changed the ID of the struct when using GetNullOrRef. So now the hash points to an item that doesn't have that hash

  • @Rolandtheking
    @Rolandtheking Год назад +6

    Nice video, i've encountered the double lookup as well, and could of used this code several times before. Does the unsafe class mean i need to compile unsafe btw?

  • @riplee406
    @riplee406 Год назад +11

    Thanks for sharing this. I will continue with the old way. Why? It's easier and makes more "readable" sense to junior developers who will be supporting my code.

  • @MCmotorsports12
    @MCmotorsports12 Год назад +1

    Isn't this just the same thing as a the built in ConcurrentDictionary?

    • @alexisfibonacci
      @alexisfibonacci Год назад

      That is built specifically for concurrent/multi-threaded access. This scenario does not imply that you will be looking to cut out the double hash computations which on a tight loop save CPU time and allocations.