“Turn All Your Enums Into Bytes Now!” | Code Cop

  • Published: 22 Jan 2025

Comments • 341

  • @crystalferrai 10 months ago +126

    There is a compiler code analysis warning for this situation that, if enabled, will trigger if you set an enum to anything other than int.
    "CA1028: Enum storage should be Int32"
    This warning is not enabled by default, but I imagine there is some reason it exists. Microsoft says this about it:
    "Even though you can change this underlying type, it is not necessary or recommended for most scenarios. No significant performance gain is achieved by using a data type that is smaller than Int32."

    • @jamescanady8156 10 months ago +6

      Right, when you’re dealing with a billion + rows, these small optimizations are critical.
      But as it said in the first line of the content, it depends on your usage.

    • @Alguem387 10 months ago +1

      Honestly, that's a very anal and stupid warning.
      There are cases where you need specific integer types. Off the top of my head:
      In protocols
      When the padding matters

  • @DmitryKandiner 10 months ago +98

    Actually, a couple of times I have specified a non-default base type for enums. But in all those cases the actual data was transferred as a binary stream via a serial connection to an embedded device, so I had to carefully follow the protocol.

    • @paulkoopmans4620 10 months ago +7

      Agreed! That is probably the only viable reason though. For 1 million records in a table this supposed "optimization" saves 2.86MB. On top of that it might also do more harm than good, as the IO could be negatively impacted.
      The "view" of the OP is also for non-flag enums. An int is a good default for flag enums, as anything smaller only gives you 8 or 16 flags. The win32 API is full of int and long flag enums.
      I've done some PLC stuff in the past too, and the PLC was even putting bit flags together in 16- or 32-bit registers. That is the only time I have had to step away from the default.

    • @peroyhav 10 months ago

      @paulkoopmans4620 for 1-2M records an hour, that's actually about 3 GB every month.
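
The two figures in this sub-thread can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming a saving of 3 bytes per row (a 4-byte int column replaced by a 1-byte column), a 30-day month, and 1.5 million rows per hour as the midpoint of "1-2M". `StorageMath` and its method names are hypothetical, and index copies, row overhead and compression are ignored.

```csharp
using System;
using System.Globalization;

public static class StorageMath
{
    // Bytes saved per row when a 4-byte int column becomes a 1-byte column.
    public const long BytesSavedPerRow = sizeof(int) - sizeof(byte);

    public static long SavedBytes(long rows) => rows * BytesSavedPerRow;

    // Binary megabytes (1 MiB = 1024 * 1024 bytes).
    public static double ToMiB(long bytes) => bytes / (1024.0 * 1024.0);

    // Decimal gigabytes (1 GB = 10^9 bytes).
    public static double ToGB(long bytes) => bytes / 1e9;

    public static void Main()
    {
        // One million rows: 3,000,000 bytes.
        long perMillion = SavedBytes(1_000_000);
        Console.WriteLine(ToMiB(perMillion).ToString("F2", CultureInfo.InvariantCulture) + " MiB"); // 2.86 MiB

        // Assumed 1.5 million rows per hour for a 30-day month.
        long perMonth = SavedBytes(1_500_000L * 24 * 30);
        Console.WriteLine(ToGB(perMonth).ToString("F2", CultureInfo.InvariantCulture) + " GB");     // 3.24 GB
    }
}
```

Both comments are consistent with each other: the per-row saving is tiny, and only the row count decides whether the total is worth caring about.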

  • @dcuccia 10 months ago +20

    Context indeed matters. I used a byte enum yesterday to reduce the packet size of a serial messaging protocol.

  • @cheezyskipper 10 months ago +18

    I swear with every Code Cop video Nick becomes more and more insane 😂

  • @vchap01 10 months ago +54

    It is a micro-optimization for most databases. But you can have tables with tens or hundreds of millions of records where some of the columns are enum values. It can make indexes perform better by being smaller. This tip might not be very relevant to many applications, but it is not wrong.

    • @josephmoreno9733 10 months ago +3

      Totally agree.

    • @7th_CAV_Trooper 10 months ago +4

      Care to explain how a btree with byte values performs better than int values?

    • @simon3121 10 months ago +9

      @7th_CAV_Trooper number of page reads (from disk) is the magical counter.

    • @davidmartensson273 10 months ago +7

      It's wrong because, while you definitely can use tinyint in the DB, there is no measurable cost in casting that to and from an int in the actual code.
      And as others and I have mentioned in other comments, using a byte enum might actually cause performance degradation.
      So keep that optimization in the DB.

    • @AEF23C20 10 months ago

      most sane databases use data alignment, and the alignment is not a byte but at least an int
      the processor doesn't know what a byte is, but it takes an int into a register without any problems or overhead
      and in sane databases no enums exist, they are not needed

  • @Ziirf 10 months ago +7

    We cast enums to bytes at work because the database is very old (25+ years) and has some foreign keys to constant types, where the foreign key is of type [tiny]. Now that we have integrated EF Core, instead of joining on the table that stores the value names, we just use enums of type [byte]; this makes the EF-to-SQL relation easier.
    I don't think it is either good or bad advice; it is just a very situational thing, which you normally wouldn't have to worry about.

  • @FrancisGauthier2 10 months ago +7

    I used to work on a transactional database that used bigints for primary keys and enums, bigints everywhere! I can tell you that optimizing storage can lead to significant performance gains as well as lowering storage for backups and data warehouses. As for enums, I always use default integers, but if we do have millions of millions of records, I may consider using other types depending on the range of values, or reconsider the design approach (using discriminators or not, i.e. table per type, etc.). Storage costs are very low nowadays anyway.

  • @TomasJansson 10 months ago +3

    Just to be clear, 100% with you :). There are times of course where you should think of the types in the db. A concrete example from my previous job where we went from storing guids in strings (BigQuery didn’t have a native format for guids at the time) to its byte representation saved us a lot of storage space, and since we ingested about 10TB of json every day that optimization actually saves us quite a lot of money.

  • @GameDevNerd 10 months ago +2

    I think most people struggle to understand that optimization or "savings" have a factor/scale to them, and you can look at how effective/valuable a certain optimization or trick is by examining how it's going to _scale out_ in the context of some real or hypothetical project. A "one-off" optimization or memory saving isn't helping you much, at all, unless it's a VERY big, single thing we are talking about, like maybe optimizing install/storage size of a game or app by eliminating the need for some huge chunk of data/content. Changing the underlying value type of an enum to `byte` isn't doing much for you in and of itself. It *can* in some very specific situations, like I sometimes do this in real-time 3D/game code when I'm dealing with really big data buffers or I have to have some very specific structure layout or byte alignment. But just universally making all your enums into bytes is more likely to slightly degrade performance than enhance it, as you can be causing some misaligned byte boundaries or reducing the compiler's ability to help you, not to mention you can create some future technical debt for yourself in some situations ...

  • @NurchOK 10 months ago +2

    Optimization is a very broad topic. Using byte as the underlying type will, of course, save space, but is it worth it? A 32-bit (or 64-bit) processor is much more efficient at accessing 32 (or 64) bits of data at a time. The data, however, needs to be 4 (or 8) byte aligned. For a modern processor to read a byte, it needs to read 32/64 bits of data and strip the unnecessary bits. If the data is not aligned, the process is even slower because shifting happens. We don't see it in a high-level language, but that happens at the assembly/machine code level.
    The only time I use byte as the underlying type for enums is when I have data interchange happening, when a byte is part of a data structure sent by an embedded device.
    Something like that.

  • @JGoodwin 10 months ago +5

    There's nothing wrong with saving 3 bytes on a column in SQL Server. Odds are you have more problems than this, but that doesn't inherently make it a problem to seek out the savings. As far as how that should translate to the C# representations, I think there's room to debate.
    Remember that each field isn't stored just once. It's also stored in each index. It also consumes server memory. The less you use to do the same work, the more simultaneous clients you can support. The benefit from small changes like this isn't massive, but they add up for larger datasets.
    Agree with Nick about prioritizing your performance scrutiny.

  • @PajakTheBlind 10 months ago +11

    For one reason or another I had to write a parser/serializer for some obscure format (existing one could not handle memory requirements we had - would hog up ALL THE MEMORY on a machine).
    The sample code required maintaining ordering of a list when inserting new stuff (so some stupid search through the list and inserting at proper index)... which in my case ended up being an absolute performance killer. After discovering that through profiling and replacing list with sorted list (or something like that), performance improved about 100x in relevant cases.
    In other words, do a PoC, profile it and look for things that eat up most of your performance. This way it will be less labour intensive and your time is more valuable than adding extra disk/ram to that server that is supposed to run the damn thing

  • @JohnMcLaren-s2h 10 months ago +1

    Something I haven't seen mentioned is message structs. For one of my work projects we had a messaging system that was marshalling message structs and sending them over legacy hardware with very low bandwidth. In some structs, the fields were marshalled as int32 so when message frequency was high it would start slowing down. These fields were used in combination to represent some state. We found a way to optimize and drastically reduce the size by representing each state as a single byte under one enum and use bitwise operations to combine the states together using bit masking.
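
The packing described above can be sketched with a byte-backed [Flags] enum. The comment does not show the real message format, so `LinkState`, its members and the `LinkStateDemo` helpers are hypothetical names chosen for illustration; the point is only that several on/off states fit in one byte and can be combined and tested with bitwise operators.

```csharp
using System;

// Hypothetical states; each one occupies a single bit of the byte.
[Flags]
public enum LinkState : byte
{
    None      = 0,
    Connected = 1 << 0,
    Armed     = 1 << 1,
    Faulted   = 1 << 2,
    LowPower  = 1 << 3,
}

public static class LinkStateDemo
{
    // The value that goes on the wire is just the underlying byte.
    public static byte Pack(LinkState state) => (byte)state;

    public static LinkState Unpack(byte wire) => (LinkState)wire;

    public static void Main()
    {
        // Combine two states with bitwise OR.
        LinkState state = LinkState.Connected | LinkState.LowPower;
        byte wire = Pack(state);
        Console.WriteLine(wire);                                     // 9

        // Test a single bit on the receiving side.
        Console.WriteLine(Unpack(wire).HasFlag(LinkState.LowPower)); // True

        // Clear a bit with AND and the complement of its mask.
        state &= ~LinkState.LowPower;
        Console.WriteLine(state);                                    // Connected
    }
}
```

A byte leaves room for eight such states; a ninth would need `ushort` or a second field.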

  • @Chris-zb5nm 10 months ago +5

    This enum byte thing has existed for like 15 years.
    The fact that people react to this like it's something new makes me wonder about the quality of the software these devs create every day!

  • @Xankill3r 10 months ago +18

    We have a few Enums in our game code that are set to bytes and shorts but that's specifically because we do have hundreds-of-thousands to millions of them in contiguous arrays for the game's data so we do gain a fair bit of performance doing this. Especially from a cache point of view.

    • @Sindrijo 10 months ago +2

      Do you combine eight booleans into one?

    • @Xankill3r 10 months ago

      @Sindrijo we don't really have a lot of places where we would need to do that. Except maybe one instance where we pack groups of six bools into single bytes - wasting 2 bits, yes, but it works out in that situation.

    • @Revin2402 9 months ago

      Performance? Probably not. You pack more information into your bytes, but packing and unpacking is usually less performant than not doing it. We were doing it in multiplayer games to reduce the bandwidth needed, but it was a rather unusual practice. You can find the same principle a lot in the C++ code in Unreal Engine.

    • @Xankill3r 9 months ago

      @Revin2402 we have benchmarked it (as one should) and it actually does improve performance in our case. I'm guessing it is due to reduced cache misses.

    • @psharpnet 1 month ago +1

      @Sindrijo Those are called "flags". And yes, we did those things back in the day ... some 30 years ago. Fast-forward to today, I can think of VERY few cases where doing this type of thing could be called 'reasonable', and all those cases imply using tiny devices with very little memory and storage capacity. It makes NO sense to do that in ANY other case. As Nick said, not even in those 'multi-billion-row' tables. Because you can surely bet you're having WAY worse problems than saving 3 bytes (or 6, or 9) on each row, starting with the very fact that you have a SINGLE table with gazillions of rows stored in it.

  • @darkherumor 10 months ago +1

    03:32 "Why does it bother me so much" You've no idea how much I relate to this everyday :D

  • @jakubbalarin6867 10 months ago +3

    Column size in a database matters if the column is part of an index. Whether a key is 8 bytes or 5 bytes makes a big difference, because more keys fit in one 4KB db page.

  • @fatihgenc7385 10 months ago +15

    To be honest I do it too: when I create an enum to be saved in the database I give it the byte type, because it is just a keyword and a conversion in the DbContext and nothing more, easy to implement. Yes, the projects I worked on were most probably missing better performance optimizations, but it does not matter; if I am aware of something while implementing, I will do it. It does not take a lot of effort.

    • @7th_CAV_Trooper 10 months ago +1

      What is it you think you're optimizing? Database size?

    • @fatihgenc7385 10 months ago

      @7th_CAV_Trooper yes, I work on a project where we get almost a hundred million messages daily and we have large databases for each env. We clear the data periodically, but they are still big.

  • @MajeureX 10 months ago +1

    In general, it's good advice to follow the conventions of whatever programming language or environment you're using, even if on the surface the convention may seem counter-intuitive. The convention of always using the `int` type in C# is a prime example. It may seem intuitive to use the smallest integer type that holds the range of values needed, but the wisdom of the crowd knows something that isn't obvious, namely that modern processors are optimized to process data values on 32-bit and 64-bit boundaries. So you may think you're being smart and efficient, but in reality there's no benefit.
    Sometimes there are valid exceptions to the common conventions in specific circumstances. A skilled developer knows when these exceptions are needed. A skilled developer also knows how to performance test code and search for specific and measurable optimizations when performance improvements are clearly needed.

  • @peroyhav 10 months ago +2

    We have a database (not SQL) where we store terabytes of data. Even then, optimization into byte enums is not the best way to optimize the storage. We do have some byte enums, but those are special cases that are transmitted as binary over the network in high-traffic paths. In a couple of edge cases I've even had to merge multiple enums into a single byte for serialization to save bandwidth and egress costs.

  • @MZZenyl 10 months ago +2

    I've only used byte-extending enums once; to reduce the size of a struct, which might be created 100,000-200,000 times per second, every second. With byte-extending enums, the struct is exactly 16 bytes in size, which also aligns with its "natural" packing size.
    And even then, I doubt using int-extending enums would actually result in any actual performance degradation of note.

  • @Reellron 10 months ago +4

    I've done this and I will probably do it again, but I'd never post it as general advice, since it's much more likely to be a headache while programming than it is to affect performance. For storing enums in a database, I'd use byte (tinyint) at least 95/100 times though, since it'll be a primary key and I have some autistic traits.

  • @andywest5773 10 months ago

    The convention for naming variables is to use multiple characters, which may not be the most efficient option. Most classes only contain a few variables. Using names longer than a single character wastes space, as each character uses 16 bits. Single-character variable names allow for more efficient storage.

  • @SteffenSkov 10 months ago +2

    I think you are right.
    If I had to come up with an example of where enum as byte maybe could have value as an optimization, a game like Minecraft comes to mind. Imagine that almost all properties of each block could be expressed as byte-sized enums, then I guess it could be viable - for instance if you find that the game needs to swap out memory so often that it becomes a problem, if you could "magically" reduce the memory footprint by, say, half that might solve the issue.
    But again, this would be a very special situation, and not something for everyone (or anyone) to do by default.

  • @SnowImp 10 months ago +3

    When I was told about this video my instinctual answer was "with such cheap storage available, why waste your time" and my second reaction was "database normalisation is better than worrying about enum types". Maybe that reaction is 'cos I'm a database programmer at heart. Hear, hear on identifying what is a failure in understanding, not a tip.

  • @jameshancock 10 months ago +1

    It’s actually slower in the runtime to do that because the processor has to take the value and align it to the bitness of the processor so it is slower and gets expanded on the processor to the full bitness which uses the same memory anyhow.

  • @IllidanS4 10 months ago

    The general advice is something along the lines of "Prefer what is natural, not what is the smallest." With enums, you have to remember that they are just named integers ‒ by default there are about 4 billion other valid values in addition to the 10 you enumerate, but why bother getting rid of them? If there is a *natural* reason to use a different underlying type, sure, but in that case all types are equivalent; it does not matter that one is the smallest. Sure, if you need to match a particular binary format, in a file, communications protocol or similar, then yes, a byte could indeed be natural, but not because it is the smallest, but because that is what the value actually is!
    Another take: why stop at byte? Make it tightly bit-packed! And if there is not a power of two number of values, use fractional bits (yes, that is "possible" too)!

  • @dire_prism 10 months ago

    I do this in very rare cases for gamedev. I'm also sure that you're right you could find greater savings than that in my code...
    You didn't even have to get in to how absolutely zero bits are saved unless the byte enums are used along with other small types inside another type. Otherwise they will be word aligned anyway and the 'saved' bits will be unused.

  • @rmcgraw7943 4 months ago

    Nick, in this case, I would have to side with the idea of forcing your enums to inherit from a type that only supports the precision needed. I think this author's suggestion was made after he spoke to someone smarter than himself, who didn't explain WHY you MIGHT want to do this. The size of databases matters, not so much because of their storage requirements, but when it comes to indexes and searching that data; the more binary data you have to search, the longer it takes. The larger the backing field of an index is, the larger the DB index and the more processing it takes to create and maintain those DB indexes. So, from a DB point of view, this is good.
    From a C# point of view, the size is a negligible concern. System.Enum is a class, but enum types themselves are value types. Depending on how a value is used and where it is declared, it could be stored on the stack or on the heap (as a field of an object, or when boxed), which means potentially incurring the cost of allocation and garbage collection, but this is no different from most other variable declarations in C#.
    Another C# concern is byte types of enums are generally used for enums that are flags (FlagAttribute), so it could confuse junior devs, unless comments are there to explain.
    So, overall, I would say, yes, do mitigate data storage requirements for your database, especially for the sake of your SQL Server databases, but as Nick references, there are databases that have built in optimizations for Enums. In both cases, it’s not going to make a very big impact on your application, especially since most modern applications can scale out.
    In general, pick types, for both databases and programming code, that support only the scale and precision you need; do NOT create decimal values for integers, and DO use unsigned integers (note: not CLS-compliant, I think) for non-negative integers, etc. But don't rewrite an existing codebase for trivial optimizations; simply learn and write better code as you move forward in your coding career.
    This suggested optimization is valid, but I would probably file it under "neat and good to know"; it is not going to save your poorly architected systems.
    In C# 1.0b, I was doing this, but not for the sake of my DB, but for the sake of casting to and from those enums and base types. Back then, we didn't have all the Enum static methods we do today.

  • @MaQy 10 months ago +21

    It's useful for indexes, where size indeed matters a lot. I would not disregard this advice in all circumstances. Now, doing it by default, probably not necessary.

  • @laurentallenguerard 10 months ago +1

    I used to port mobile games in 2005, when we had 2 MB of RAM. We had only one class to avoid wasting bytes on class definitions; everything was so optimized. Glad we have now reached petaflops with the H100 GPU!

  • @tunawithmayo 10 months ago

    I do use this, but only when the value maps to DB column that is a tiny int. It becomes a sanity check that helps avoid declaring a value that can't be stored in the database column. ( I also think I used it once for binary serialization, but that is very uncommon these days since most things serialize to JSON or XML). If the value only ever exists in application memory, I am with you, just let it be regular int.

  • @HamishArb 10 months ago

    I also use typed enums, but these are my scenarios where I use them
    - sending/receiving it as raw bytes (which also requires care around endianness),
    - for interop where I match the native size (if I can be bothered to even deal with the issues caused by using the wrong type)
    - changing it to long/ulong for a bitfield
    - and very occasionally I change it to unsigned if I'm expecting to do calculations on it
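
The first bullet above (raw bytes plus endianness) can be sketched with `System.Buffers.Binary.BinaryPrimitives`, which writes an integer in an explicit byte order. `MessageKind`, its values and the `WireFormat` helper are hypothetical and only illustrate the idea; a real protocol dictates both the width and the byte order.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical 16-bit message type; the values are made up for illustration.
public enum MessageKind : ushort
{
    Heartbeat = 0x0001,
    Telemetry = 0x0102,
}

public static class WireFormat
{
    // Writes the enum as two bytes in big-endian (network) order,
    // regardless of the byte order of the machine running the code.
    public static byte[] Encode(MessageKind kind)
    {
        var buffer = new byte[sizeof(ushort)];
        BinaryPrimitives.WriteUInt16BigEndian(buffer, (ushort)kind);
        return buffer;
    }

    // Reads two big-endian bytes back into the enum.
    public static MessageKind Decode(ReadOnlySpan<byte> bytes) =>
        (MessageKind)BinaryPrimitives.ReadUInt16BigEndian(bytes);

    public static void Main()
    {
        byte[] bytes = Encode(MessageKind.Telemetry);
        Console.WriteLine(BitConverter.ToString(bytes)); // 01-02
        Console.WriteLine(Decode(bytes));                // Telemetry
    }
}
```

With a single-byte enum the byte-order question disappears, which is one reason `byte` shows up so often in wire formats.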

  • @Albileon 10 months ago +5

    I mean... the only place we'd potentially think of doing this - is in our online game which is tick based in a lockstep model: because we try to keep data usage as low as possible. And probably we wouldn't even go as far, because there's more gains elsewhere to be made - but it also doesn't do any harm... I guess? So we'd probably do that at some point, but we haven't done much optimizations yet to begin with and are currently at about 6 mb's per hour per player, which is pretty good already! But if this is the kind of optimization you need, sure. All the bits help... but 99.9% of all use cases this is not necessary.
    But I would also say it's not a bad thing to do, if you're sure it's the right data structure. But you shouldn't do it for the performance reason, but because it's the correct data type.

    • @NickSteffen 10 months ago

      Yea, I can see maybe doing it for something like that or maybe where you are trying to fit a lot of things in a single network packet. (Or even cram something into specialized headers) even then though I would probably just transform it when creating the packet and leave the enum be in code. Which is the same thing that should be done for sql as well.

  • 10 months ago +22

    I've only ever used the underlying type change for P/Invoke specifically to avoid having to cast. I can't think of any other reason

    • @user-tk2jy8xr8b 10 months ago +3

      Another reason is memory optimization, but I believe most C# devs never face such a necessity. When you really need it - you know it

    • @7th_CAV_Trooper 10 months ago +2

      Unless you're compiling byte-aligned, which is a performance killer, it does not save memory.

    • @nickbarton3191 10 months ago +2

      And comms packets, with FieldOffset and MarshalAs attributes, or masks, anything that gets close to hardware and unmanaged code.

    • 10 months ago

      @nickbarton3191 Yeah, close to hardware is the only real reason I can think of

    • @user-tk2jy8xr8b 10 months ago

      @7th_CAV_Trooper it does. Consider struct S1 { short; long; int; long; byte; long; byte; } vs struct S2 { long; long; long; int; short; byte; byte; }.
      sizeof(S1) is 7*8=56 bytes, sizeof(S2) is 4*8=32 bytes. Same with classes. All "big" fields are aligned so memory access takes 1 read cycle, all "small" fields fit into 8 bytes so access takes 1 read cycle again. With no alignment certainly S1 would also take somewhat around 32 bytes, but the unlucky "big" fields would require 2 read cycles. Byte fields are byte-aligned and it's fine.
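
The S1/S2 comparison can be checked directly. This is a minimal sketch, assuming a 64-bit .NET runtime where `Unsafe.SizeOf<T>()` is available in-box (.NET 5 or later); the field names are invented, since the comment only lists field types. It covers structs only, which use sequential layout by default, so the declared order is the order in memory.

```csharp
using System;
using System.Runtime.CompilerServices;

// Field order from the comment: small and large fields interleaved.
public struct S1
{
    public short A;
    public long B;
    public int C;
    public long D;
    public byte E;
    public long F;
    public byte G;
}

// The same seven fields, ordered from largest to smallest.
public struct S2
{
    public long B;
    public long D;
    public long F;
    public int C;
    public short A;
    public byte E;
    public byte G;
}

public static class LayoutDemo
{
    public static void Main()
    {
        // Each long must start on an 8-byte boundary, so every small
        // field in S1 drags padding behind it.
        Console.WriteLine(Unsafe.SizeOf<S1>()); // 56 on a 64-bit runtime
        // In S2 the int, short and two bytes share one 8-byte slot.
        Console.WriteLine(Unsafe.SizeOf<S2>()); // 32 on a 64-bit runtime
    }
}
```

The data is identical in both structs; only the declaration order differs, which is why shrinking a field helps only when its neighbours can use the space that is freed.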

  • @Maskrade 10 months ago

    The funny thing is that enums have extra things that mean there is probably a runtime cost that wouldn't go away at all, even if you used byte

  • @dev.gustavo 10 months ago +8

    Even though it doesn't offer us a significant performance improvement, I didn't really get why not to do that. Would that actually represent a performance decrease or something similar?

    • @burger_flipper 10 months ago +6

      On your 64-bit platform, your CPU won't be faster at copying a byte than at copying an integer. The only reason to do it is space optimization, which is only really worth looking at when you're dealing with big applications that have a huge database.
      Code-wise it's just annoying for people who have to use your enum, because 99% of the time it will be converted to an integer anyway, since you don't need it to be a byte for development purposes. That is even worse than having an integer in the first place

    • @monomanbr 10 months ago +2

      Indeed performance can be affected negatively because of alignment issues. Typically objects/structs aren't compacted so that fields align with what the processor handles more efficiently, so in memory probably an enum will occupy 8 bytes, even if you change its base type to byte or short. If you force the compiler to pack the fields then performance can degrade substantially because fields may be split into two memory reads and writes. Even though the benefit in the database storage requirements comes it will penalize performance there too because of misalignment.

    • @dev.gustavo 10 months ago +3

      @monomanbr Could you please provide me some concrete references where I can learn about that? I really can't comprehend why such a simple thing could ever have the opposite effect of what it should

    • @szynkie6710 10 months ago +2

      Up. Changing enums to shorts makes no sense, but I do not understand your anger; at least this tip does not have a negative impact.

  • @aul7643 10 months ago +13

    Correct me if I'm wrong, but isn't the memory "aligned" or something like that, though? Like, using a 1 byte structure wouldn't be beneficial for memory because three other bytes would be "skipped" since objects can only be "aligned" every 4 bytes... I'm sure someone knows the correct terms for these concepts so, apologies for the ignorance... I just remember seeing something like this while working on some lower level stuff

    • @SacoSilva 10 months ago +1

      Not an expert either, but I don't think that happens in SQL, which is the point of the advice. But for starters, this is only relevant when using EFCore, and even then, you could just tell EF to use a byte column instead of forcing the enum to be a byte.

    • @mad_t 10 months ago

      Exactly! I'm glad someone remembers that.
      Using byte enums without using 3 padded bytes is stupid.
      If you have 4 enums in same structure then yes, this byte conversion will save you 12 bytes (enums will take 4 bytes instead of 16).

    • @yoshimaker-iot 10 months ago +1

      @mad_t Even more, you'd have to have 4 enums *and* have them sequentially defined in the structure *and* the first would have to be on a %4==0 boundary.

    • @MulleDK19 10 months ago

      No, this is wrong. Alignment is not applicable to single bytes. Types are aligned on multiples of their size, and since bytes are 1 byte, any address satisfies alignment.

    • @SacoSilva 10 months ago +1

      @MulleDK19 No, this is not wrong. If you have, for example, a struct with an int field and an enum field, you will spend 8 bytes per struct instance, regardless of the type of the enum.
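
Both sides of this exchange can be measured. This is a minimal sketch, assuming .NET 5 or later for `Unsafe.SizeOf<T>()`; the enum and struct names are hypothetical. A byte-backed enum has an alignment of 1, but a struct is padded to the alignment of its widest field, so the saving only appears when several small fields sit next to each other.

```csharp
using System;
using System.Runtime.CompilerServices;

public enum ColorInt { Red, Green, Blue }          // default underlying type: int
public enum ColorByte : byte { Red, Green, Blue }  // one byte per value

// An int next to one enum field, in both flavours.
public struct IntPlusIntEnum  { public int Id; public ColorInt Color; }
public struct IntPlusByteEnum { public int Id; public ColorByte Color; }

// Four enum fields in a row, in both flavours.
public struct FourIntEnums  { public ColorInt A, B, C, D; }
public struct FourByteEnums { public ColorByte A, B, C, D; }

public static class EnumLayoutDemo
{
    public static void Main()
    {
        Console.WriteLine(Unsafe.SizeOf<ColorByte>());       // 1
        Console.WriteLine(Unsafe.SizeOf<IntPlusIntEnum>());  // 8
        // 4 bytes for the int, 1 for the enum, 3 bytes of trailing padding.
        Console.WriteLine(Unsafe.SizeOf<IntPlusByteEnum>()); // 8
        Console.WriteLine(Unsafe.SizeOf<FourIntEnums>());    // 16
        // No field needs more than 1-byte alignment, so nothing is padded.
        Console.WriteLine(Unsafe.SizeOf<FourByteEnums>());   // 4
    }
}
```

So the byte enum is free of alignment requirements on its own, yet the int-plus-enum struct still costs 8 bytes either way, and the "4 instead of 16" figure holds only for the four-in-a-row case.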

  • @emloq 10 months ago

    The main problem with reactions on LinkedIn is that there is no "dislike" button; you can only "react", so the minimum reaction to any post will be positive for the algorithm

  • @torqtorq 10 months ago

    Not to mention that handling byte is actually slower than int. The fastest type obviously would be types whose size matches the bus width which is usually 64 bits now.

  • @Dustyy01 10 months ago +13

    I totally agree with you and how the post is written (being kinda misleading).
    But what is the downside of just making most of the enums you're using a byte (only if you 100% know that the enum has less than 255 values obv.) ? I don't see any disadvantage for that or I am wrong here? It's more like it doesn't matter yea but then it's also not bad to do it right?

    • @AlFasGD 10 months ago +3

      You might accidentally introduce padding in the memory alignment of your fields. If you have a class holding some enum fields, and you make some byte, some int, etc. it's probably going to end up aligning those byte fields anyway and wasting your 4 bytes of memory. Not to mention that most of the time you are still reserving all these 4 bytes in the register, so it's faster in some scenarios to give your register 4 bytes to begin with.
      On the other hand, if you want your structs to be highly compact in memory and absolutely optimized for 4-byte alignments, say you have a combination of two enum fields, and your only concern is comparing those combinations by interpreting the entire struct as a single 4-/8-byte value, then maybe.
      The consensus here is, it barely matters. If it does matter, prove it with benchmarks. If you prove it, make sure to make reasonable changes according to your domain constraints and always measure how this impacts your performance. Without measuring, nothing is 100% certain.

    • @Dustyy01 10 months ago +1

      @AlFasGD appreciate the technical explanation a lot 👍

    • @ajdinhusic2574 10 months ago

      @AlFasGD So you're saying it can be better, but not worse?

    • @AlFasGD 10 months ago +1

      @ajdinhusic2574 nope, I'm saying it can be either better or worse or the same. Introducing padding = more allocated memory per instance. Also, padding could hinder cache locality. Again, we're talking in the scope of nanoseconds and a few bytes, which you don't care about most of the time.

    • @ajdinhusic2574 10 months ago

      @AlFasGD thanks for the clarification!
      I didn't get the worse part from your answer initially, because my thought process was: well, if it pads to be 32 bits again, then it's the same as the Int32 bit size. So it can be 'better' but not worse.
      But thanks for mentioning it can in fact be slower than int/ more memory, I did not know that.

  • @patxy01
    @patxy01 10 months ago

    I've always made my enums inherit short...
    Never really thought further about it. Some senior guy told me to do that 10 years ago and I never really thought through it... Yeah Carlos, you were the one telling it 🙂
    Anyway, I don't see why it would be bad to do it like that.

  • @djenning90
    @djenning90 10 months ago

    One thing you didn’t mention is that enums when members of a class or struct, are usually 32-bit aligned on a 32 or 64-bit architecture. The data bus is wide, and when reading a byte, a whole word is transferred. Even if just a byte were written or read, it wouldn't happen in any fewer clock cycles than for a word. So in actuality there is neither a space nor a speed benefit to using byte over int. Except maybe for an array of enum, but I’ve never seen a use case for that, and don’t feel compelled to optimize for that.

  • @mahdiyar6725
    @mahdiyar6725 10 months ago +1

    I converted all my enums to byte last week!!!! 😀

  • @fynnschapdick4434
    @fynnschapdick4434 10 months ago

    I love your Code Cop videos Nick. Even if I didn't learn anything new, it entertains me immensely the way you get excited^^

  • @AntoineBernelin
    @AntoineBernelin 10 months ago

    The enum doesn't need to be a byte for the database to use tinyint anyway. Just like one would most likely not let EF use longtext for every single string stored in db, the database column types should be handled in the DbContext configuration/Entity configuration/Entity attributes.

  • @jakubsuchybio
    @jakubsuchybio 8 months ago +1

    Well, I was writing a long comment and autoplay just discarded it...
    TL;DR - We have an ECG Holter module with 15 years of legacy C#, where 12 days of 12-lead ECG gives you something like 5 GB of data (2-byte samples, 500 Hz) in a 32-bit process. We have tens or hundreds of components that render the data in different ways. We do stream that data from disk, and even then we still hit the 2 GB RAM limit of a 32-bit process way before 7 days of 12-lead ECG. So yeah, maybe here byte enums might help, if they are used heavily for some parts of those samples. So... Yeah... They would probably help us. And you lose nothing by using them. I don't understand your anger here. There just are some use cases for them. And there are rarely use cases where 255 values are not enough. So I don't see anything wrong with using them.

  • @DiomedesDominguez
    @DiomedesDominguez 10 months ago +2

    Ok, I do use byte in my enums, not from an optimization perspective, but because I'm lazy. I don't even remember when or why I started using byte, probably because of a requirement from an old tech lead on a project that is a decade or more old. It doesn't even affect the performance of the applications or the databases, but now I have muscle memory when creating enums. Sorry.

  • @ABC_Guest
    @ABC_Guest 10 months ago

    "Just watched Nick's latest Code Cop episode, and as always, it's a goldmine of practical advice! 🌟 His take on the 'Enums as Bytes' craze really put things into perspective. It's fascinating to see how a seemingly minor optimization, like treating enums as bytes, can be dissected to reveal deeper implications on code maintainability and performance. Nick does a great job explaining why context is key and how what works in one scenario (like optimizing for database storage) might not be a silver bullet for every application. It's these nuanced discussions that make software engineering so intriguing! Thanks for another thought-provoking video, Nick. Keep demystifying those LinkedIn tips! 💻🔍 #CodeCop"
    _This is definitely not ChatGPT speaking - where did you get that from!?_

  • @homosuperior1337
    @homosuperior1337 10 months ago +2

    Okay, the Key Message is, don't overoptimize. Got it, thank you Code Cop 🙂

    • @Dojan5
      @Dojan5 10 months ago

      That and there are likely other places in the application one can make a difference in before this level of nitpicking ought be considered.

    • @homosuperior1337
      @homosuperior1337 10 months ago

      @@Dojan5 okay the point you mentioned, i overheard, is overheard an english Word? I mean, i did not get it, but thanks.

  • @ifireblade09
    @ifireblade09 10 months ago

    I think it still missed a key point. unless you are packing your bytes into one register etc. the OS still allocates higher amount based on if it's 64-bit or 32-bit runtime.
    I'd like to see say 10000000 allocations or something and follow the size difference between byte and say long if you are running 64-bit. My hypothesis is you wouldn't see a difference at all.

  • @yoshimaker-iot
    @yoshimaker-iot 10 months ago +2

    I do a fair amount of typing enums as bytes - I'd go so far as to call it "fairly common" in my code. HOWEVER, it's typically used when doing things like defining device registers when I need to serialize or deserialize it from/to a bus where the size matters due to data alignment, not for any sort of savings. So defining an enum as a byte makes good sense in specialized cases, but if you're writing those cases, you already know why and wouldn't call it a "tip" but a necessity for those narrow cases. The advice, as given, is just dumb.
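
    A rough sketch of the "size matters on the wire" case, assuming a hypothetical 3-byte packet header (one command byte followed by a 16-bit length). The `Command`, `Header` and `WideHeader` names are invented for illustration and do not come from any real protocol.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical command codes; the byte backing type mirrors the wire format.
enum Command : byte { Ping = 0x01, Read = 0x02, Write = 0x03 }

// The same codes with the default int backing type, for comparison.
enum WideCommand { Ping = 1, Read = 2, Write = 3 }

// Pack = 1 asks for no padding between fields, as a byte-oriented protocol would.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct Header { public Command Command; public ushort Length; }

[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct WideHeader { public WideCommand Command; public ushort Length; }

static class PacketDemo
{
    public static int HeaderSize() => Unsafe.SizeOf<Header>();
    public static int WideHeaderSize() => Unsafe.SizeOf<WideHeader>();

    static void Main()
    {
        // Only the byte-backed version lines up with the assumed 3-byte header.
        Console.WriteLine($"Header:     {HeaderSize()} bytes");
        Console.WriteLine($"WideHeader: {WideHeaderSize()} bytes");
    }
}
```

    Here the backing type is chosen for correctness of the layout, not for savings, which is the distinction the comment draws.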

  • @OhhCrapGuy
    @OhhCrapGuy 10 months ago +1

    I've only used a non standard backing type for an enum a few times, but every single time it's been UInt64, not (S)Byte or (U)Short.
    Why? Because I had more than 32 values and it was being used as Flags, so I had more than 2^31 theoretical actual values.
    You might be thinking "But in what world do you ever need that many flags?!"
    Letters and numbers. I needed to store all of the letters and numbers that were present in a string as a precalculated value that could be checked for maximum similarity between strings in O(1) time using some bit ops. Made the program about 7 or 8 orders of magnitude faster, and all it cost was O(n) additional space complexity.
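
    One way the letters-and-digits idea could look, as a sketch only: the commenter's actual code is not shown, so the bit assignment (bits 0-25 for a-z, bits 26-35 for 0-9), the `CharMask` name and the "shared character count" similarity measure are all assumptions. `BitOperations.PopCount` needs .NET Core 3.0 or later.

```csharp
using System;
using System.Numerics;

// 36 possible flags do not fit in 32 bits, hence the ulong backing type.
// Assumed layout: bits 0-25 = letters a-z (case-insensitive), bits 26-35 = digits 0-9.
[Flags]
enum CharMask : ulong { None = 0 }

static class CharMasks
{
    // Precompute once per string: O(n) in the string length.
    public static CharMask Of(string text)
    {
        ulong bits = 0;
        foreach (char c in text)
        {
            if (c >= 'a' && c <= 'z') bits |= 1UL << (c - 'a');
            else if (c >= 'A' && c <= 'Z') bits |= 1UL << (c - 'A');
            else if (c >= '0' && c <= '9') bits |= 1UL << (26 + (c - '0'));
        }
        return (CharMask)bits;
    }

    // Constant-time upper bound on similarity: how many distinct characters both strings contain.
    public static int SharedCount(CharMask a, CharMask b) =>
        BitOperations.PopCount((ulong)(a & b));

    static void Main()
    {
        Console.WriteLine(SharedCount(Of("Hello"), Of("world")));
    }
}
```

    Two strings whose masks share few bits cannot be very similar, so the mask comparison can reject most candidates before any expensive string comparison runs.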

  • @allfre2
    @allfre2 10 months ago +1

    Micro optimization ?

  • @billy65bob
    @billy65bob 10 months ago +1

    I just set them to match the database type.
    If it's a 'tinyint', even when it makes 0 sense, then the one in code is a byte.
    if it's a 'bigint' for a classic non-flag enumeration, then it's a long, even though there is no way I'd ever get anywhere close to exceeding 2 billion.
    Unless you're dealing with many thousands of usages and objects, this level of optimisation will save you less than the runtime uses to even represent your enum in the application (with interned strings and everything), and that's assuming you're paying attention to alignment (typically 4 or 8 bytes) or abusing [StructLayout] enough to actually even benefit from such optimisations.

  • @adamstawarek7520
    @adamstawarek7520 10 months ago

    I'm guilty of this myself. I once set up a default conversion from enum to tinyint in EF, thinking it was a small but nice optimization.
    I'd forgotten that one of the enums had negative values, and this micro-optimization cost me about 2 hours of debugging, trying to figure out why some value was 252 (or smth like that) out of nowhere 😅 Angry at myself, I reverted it back to integers.
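
    A small sketch of how a negative enum value can turn into something like 252 when it is squeezed into an unsigned byte. The `JobStatus` enum and its values are hypothetical; the comment does not say which value was involved.

```csharp
using System;

// Hypothetical int-backed enum with a negative "error" member.
enum JobStatus { Failed = -4, Pending = 0, Done = 1 }

static class NarrowingDemo
{
    // An unchecked narrowing conversion keeps only the low 8 bits.
    public static byte ToTinyInt(JobStatus status) => unchecked((byte)status);

    // A guard that could run before any such conversion is configured.
    public static bool FitsInByte(JobStatus status) =>
        (int)status >= byte.MinValue && (int)status <= byte.MaxValue;

    static void Main()
    {
        Console.WriteLine(ToTinyInt(JobStatus.Failed));  // 252, i.e. 256 - 4
        Console.WriteLine(FitsInByte(JobStatus.Failed)); // False
    }
}
```

    Checking every member with something like `FitsInByte` before narrowing the storage type would have flagged the problem up front.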

  • @SomeTechyPerson
    @SomeTechyPerson 10 months ago

    I have inherited code where people have done this. Just as you say, there are hundreds of places in the same repository which could be optimized to save more space or lead to better performance.

  • @tymurgubayev4840
    @tymurgubayev4840 10 months ago

    in very rare cases, when implementing some pre-defined APIs, I had to use a different underlying enum type, but it was `uint`. I have never ever (consciously) used anything smaller than `int`.
    Also, while it technically does save you 3 bytes per value, I *think* it can harm the performance, because (most) modern CPUs work better with 4 or 8 byte values than with single bytes.

  • @StevieFQ
    @StevieFQ 10 months ago +2

    Did something like this except i used ulong for the enum. And the enum was [Flags] decorated. And we were actually running out of flag values.

  • @mbrdevuk
    @mbrdevuk 10 months ago

    I used “long” on a flags enum once, haven’t needed to go the other way before. Never say never, but also never say always!

  • @marceloleoncaceres6826
    @marceloleoncaceres6826 4 months ago

    Thanks for sharing your time and knowledge.

  • @IanGratton
    @IanGratton 10 months ago +1

    Nick - I think you need to wear a blood pressure monitor and show us the before/during reading when you do the Code Cop videos 😂
    I remember - a long time ago now - the only place where you could grow your understanding was the language specification, the quarterly MSDN, the library vendor, your more experienced peers (if you had them) or a book or course from a highly regarded SME (if such a thing existed). If all that failed - do the time - experiment and figure it out - the ‘science’ bit of computer science.
    The internet is sometimes a great place for fact - and equally a dumping ground for positively reinforced nonsense. It's really unfortunate that cut/copy/paste coding became prevalent - but remember - our industry enabled that! With that often comes a lack of desire to understand the how and why. If something you find solves your problem, PLEASE take the time to appreciate why, or whether it even really does.
    What exciting times we live in… maybe it's time to go re-watch Mike Judge's 2006 film Idiocracy…
    LLM vendors...you are 100% not blanket using these sites for your training models - right ?

  • @travisabrahamson8864
    @travisabrahamson8864 10 months ago

    I personally would only do this on the very rare occasion that I would need the enum value to be a byte, either because the message required it or the db, and that would be only to avoid casting and assure type safety. But I've been developing for 30 years and don't even need one hand to tell you how many times I've run into that situation.

  • @F1nalspace
    @F1nalspace 10 months ago +1

    Okay, I get it and "almost" agree with you that this advice is for the most part total garbage... but what if I say there is a legit reason to change the type of an enum?
    Yes, there is:
    If I want to lay out data in unions that contain different data based on an "enum" type, but want the union to keep the same length, say 16 or 32 KB. In that situation, the size of the enum matters a lot due to data alignment and cache coherence.
    If I just use the default enum (32-bit integer), then I am wasting 24 bits, because I most likely don't have enums with more than 10 values. In that case using a byte is totally legit, and then using another byte and a short after it to lay out the data efficiently can make a difference in performance and even in stability for multi-threading use cases, due to false sharing or values straddling cache lines. Also, there are Windows API functions that use shorts or bytes for enum values as well, so in that case you'd better use a fixed-size enum too, so that it matches the native definition.

  • @tioluiso1
    @tioluiso1 10 months ago

    Hello Nick. Great content. Just a minor stupid thing: I have done some courses on SQL Server query tuning, and as a principle it is actually advisable to use the most compact possible representation for your data. The reason is not that it will take less disk space. But usually, when you do have to do a query, the engine will load pages of data, and simply put, the bigger the size of a record, the fewer records you will be able to retrieve in a single page, meaning more IO. And you generally try to reduce that logical IO as much as possible.
    Now, having said that, you're completely right. We're talking about 3 bytes for each enum. Unless your table is basically full of those enum flags, we're talking about peanuts. I bet there will be better optimizations

    • @paulkoopmans4620
      @paulkoopmans4620 10 months ago +3

      Yeah; at 1 million rows, the 3-byte difference is going to save a "whopping" 2.86 MB. The principle of relational databases was indeed always about using the tiniest possible footprint. Like in 1970, when 100 MB would go for $26,000. When relational databases were invented, we were concerned about storage. That goes for both disk and memory. Therefore using the smallest types and removing duplication was KING.
      We are not there anymore. Today, most of the SQL Server users out there just create a database with the default settings. A VERY, VERY small subset is actually able to benefit from changing the page sizes and picking data types so that data lines up and reads from storage line up with your system's data bus size and your L1, L2 and L3 caches, for a perfect, zero-waste read.
      With all the other factors, like not knowing what hardware and storage your files end up on in the cloud... the byte level of some numbers is the worst place to look. In light of the advice, you can imagine someone might go and create TINYINT status columns right beside an NVARCHAR(500) status message column, or some other NVARCHAR columns elsewhere that are "oversized" just in case? Not many know this either, but the length is not the number of characters but the number of byte pairs. An NVARCHAR(200) is a 400-byte allocation.

    • @daniellundqvist5012
      @daniellundqvist5012 10 months ago

      @@paulkoopmans4620 quite noob-common with EF is to use string without specification resulting in nvarchar(max) columns. THAT is stupid.

    • @paulkoopmans4620
      @paulkoopmans4620 10 months ago

      @@daniellundqvist5012 Yeah.. That too. EF should come with its own set of code analyzers to stop people from making those mistakes.

    • @tioluiso1
      @tioluiso1 10 months ago

      @@paulkoopmans4620 Absolutely. I have worked with DBAs on this kind of optimization, and I have seen many talks from really good people whose focus is just that, optimizing the DB. And while the principle generally stands, you always have to ask yourself how big the effort is for those meager gains. There are usually bigger issues to tackle. That being said, I have also seen some tables with 50 flags like these.

  • @adambickford8720
    @adambickford8720 10 months ago +2

    I'm in a GC'd, vm based language running in a container... every bit counts!

  • @Thorarin
    @Thorarin 10 months ago +2

    This one isn't so terrible compared to the other ones, imo. You're never going to reach anywhere near 256 different enum values for many uses.
    However, I think you don't save RAM in all cases due to memory alignment stuff that I admittedly know little about.

  • @Alguem387
    @Alguem387 10 months ago

    I used byte and short enums inside structs that model packets where there are protocols

  • @Cymricus
    @Cymricus 10 months ago +1

    This advice comes from back in the day when we had 2MB RAM to work with.

  • @minarikp
    @minarikp 3 months ago

    Hi Nick, I keep watching your videos as I can learn useful things from them. I agree, that the majority of the time you should be fine with 32-bit storage for an enum and only a few special cases need to limit it to say 8 bits. However, I think you got a bit too emotional on this one for no obvious reason. I think it's more professional if you keep it cool.
    Thank you and keep the good content flowing!

  • @winchester2581
    @winchester2581 10 months ago

    We should get CodeCope series from such creators on LinkedIn as a response to Nick's video series

  • @rauberhotzenplotz7722
    @rauberhotzenplotz7722 10 months ago

    I do it rather the other way: when going to the database, the enum values become strings (not the internal C# name, but an explicit, well-defined one, e.g. via attribute). I consider the numeric value of an enum an implementation detail.

  • @buriedstpatrick2294
    @buriedstpatrick2294 10 months ago

    We also already have the amazing [Flags] attribute to do bitwise AND/OR'ing on enums.

  • @gaborpinter4523
    @gaborpinter4523 10 months ago +1

    I've done this, but in that case my code was interacting with a microcontroller that has only 512KBs of memory, and the C# code serialized a pretty big data structure, that contained a lot of enums and stuff. But it's hard to imagine any other kind of situation except for embedded stuff in 2024.

    • @MaximilienNoal
      @MaximilienNoal 10 months ago

      I use it in an IBM PC emulator, but that's like the only other use case.

  • @wknight8111
    @wknight8111 10 months ago

    On one hand this advice *could be good in some situations*, but in a lot of places fields in a struct or class are going to be aligned on a 4-byte boundary, so you don't end up saving any space. Using a TINYINT or BYTE instead of an INT in your database could be a win over large numbers of rows, but there's often a far bigger gain to be had by limiting your strings from a default NVARCHAR(MAX) to something more reasonable (which many people don't bother to do, because EF just maps all strings to NVARCHAR(MAX) by default). Like many "optimization" opportunities, you really need to profile and measure to make sure you are actually saving something, and I imagine a lot of the people sharing this "advice" haven't profiled or measured anything. You cannot optimize by assumption alone.

  • @11keshav11
    @11keshav11 10 months ago

    This video has so much frustration and sadness for the LinkedIn community. I could feel the range of emotions and helplessness 😛

  • @nordgaren2358
    @nordgaren2358 10 months ago +1

    The byte might get padded, anyways, because things need to be aligned in memory.

  • @wolfgangdiemer2511
    @wolfgangdiemer2511 10 months ago

    I have some WinAPI calls (native stuff) where I have to marshal some C byte constants, which I converted into C# byte enums. And here the size is important, because otherwise the memory structure doesn't match anymore. But of course, this is a special edge case.

  • @jamesbond6761
    @jamesbond6761 10 months ago +1

    Actually, I did extend byte for my enums when I encoded data and codec expected my value to be 1 byte integer.

  • @flybyw
    @flybyw 10 months ago

    I used to do that in SQL with EF, but now store strings in MongoDB.

  • @macoson
    @macoson 10 months ago

    This may make sense, possibly only in Unity ECS, where entity size is limited to 128 bytes.

  • @JustFor-dq5wc
    @JustFor-dq5wc 10 months ago

    On 32/64-bit processors an 8-bit value takes a 32-bit register anyway (if I remember correctly). So it will give you no performance boost. If you have a very large number of records, it can save some memory.

  • @tubaviewa2624
    @tubaviewa2624 10 months ago

    afaik one should always use the native data type for processing (enums) to get best performance. if one really want to one can convert back and forth to get an optimal memory footprint when storing or serializing, but don't do that in memory. it hurts performance.

  • @jalzees
    @jalzees 10 months ago

    Such a great video!

  • @Zutraxi
    @Zutraxi 10 months ago

    I don't quite know if it is true. But a smart person told me once that computers are heavily optimized for ints, so using smaller types could be seen as an anti pattern.

  • @programmingCZ
    @programmingCZ 10 months ago

    well, C# uses int32 all over the place, you will most likely cast it to int32 anyway if you really need it as numeric value for something

  • @haxi52
    @haxi52 10 months ago

    I don't think most people know that values in memory have to be aligned, and registers in cpu have specific sizes. Byte doesn't actually save any memory, and in some cases might actually be slower. Also, I store the string values of my enums in db!! lol. Space is cheap. Your time is not. If you need that level of optimization C# is not your tool.

  • @SacoSilva
    @SacoSilva 10 months ago +4

    It's advice like this that makes me want this series to go the name and shame route.

  • @StarfoxHUN
    @StarfoxHUN 10 months ago +3

    It might be just me using a different/wrong practice, but when i use Enums, i usually even use negative numbers to show "Bad" states/values. This makes handling and reading the enum a bit easier for me at least, but would be impossible or at least much more confusing to do with a 0-255 range.

    • @7th_CAV_Trooper
      @7th_CAV_Trooper 10 months ago +3

      Negative values for error state seems like a good idea to me.

    • @Briezar
      @Briezar 10 months ago +1

      if you're being consistent then it's a good practice

  • @oxin8
    @oxin8 10 months ago

    I'd argue that the disk/network usage for the additional " : byte" in source control across a company will outweigh the benefits.

  • @aossss100
    @aossss100 10 months ago

    Converting all of your giant enums to tiny sized enums doesn't even give a guarantee it will occupy exactly this desired TINYINT in the process memory :) Modern CPU doesn't give a shit about LinkedIn advice, it is designed to be super optimal dealing with machine words as "default" data type. It is more likely that you will introduce few more redundant machine instructions like movzx/movsx in the emitted code by enforcing your enum storage type to the System.Byte type. So, good luck to all followers 😅

  • @banned_from_eating_cookies
    @banned_from_eating_cookies 10 months ago

    Agree 100%. The comment at 5:47 reads completely like it was generated by ChatGPT; it has the same tone, structure and verbosity.

    • @Sindrijo
      @Sindrijo 10 months ago

      "Your use of a byte as the underlying type for an enum is a sophisticated and elegant approach to solving your desire to feel cool like those hard core bit-banging low level programmers you have an inordinate and unexplainable admiration for, even though they are slightly more muskier than you. Please like me, I'm your best AI assistant, friend, soulmate."

  • @AlexBroitman
    @AlexBroitman 10 months ago

    Using byte or short for enums may harm performance, because the CPU may need to widen the value to int32 and narrow it back to byte every time.
    Sometimes I have used an enum based on long with the [Flags] attribute, when I needed more than 32 flags.
    But never byte or short.
    CPUs work with int32.

  • @rasheedmozaffar5811
    @rasheedmozaffar5811 10 months ago +2

    I was on linkedin yesterday and saw this advice, I was telling myself is it gonna go on code cop or not, and now just opened youtube, first recommended video. Lol

  • @jblack1396
    @jblack1396 10 months ago

    I will assume you have never worked with IoT devices or, in Xamarin, created a large grid. I have done both, and if we start with memory-constrained devices, this could be helpful. But before you optimize, profile and see where you may need to make improvements.
    If I just need an enum in a couple places probably not worth it, but if I am transmitting data from an edge sensor to a controller it may be useful.
    In a game, I may use an enum to specify which type of terrain in a cell and I may have millions of these, so saving a few bytes could be helpful, esp when I want to save the map, to reduce the size of the file.
    You shouldn't optimize prematurely but there are times when something like this may be useful and just shooting down the idea without considering when it might be useful is just bad form, IMO.
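
    A back-of-the-envelope sketch for the terrain-map case, where the enum values sit in a large array and element size does add up. The `Terrain` enums and the cell count are made up; only the element payload is counted, not the array header or any file format overhead.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical terrain types, once byte-backed and once with the default int.
enum Terrain : byte { Grass, Water, Rock, Sand }
enum WideTerrain { Grass, Water, Rock, Sand }

static class MapDemo
{
    // Payload of an array of T with the given number of cells, in bytes.
    public static long PayloadBytes<T>(int cells) where T : unmanaged =>
        (long)Unsafe.SizeOf<T>() * cells;

    static void Main()
    {
        const int cells = 4_000_000; // e.g. a 2000 x 2000 grid (assumed size)
        Console.WriteLine($"byte-backed: {PayloadBytes<Terrain>(cells):N0} bytes");
        Console.WriteLine($"int-backed:  {PayloadBytes<WideTerrain>(cells):N0} bytes");
    }
}
```

    Unlike a lone field inside a class, array elements are stored back to back, so the byte-backed map needs roughly a quarter of the memory. Whether that matters still depends on profiling the actual game.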

  • @nikolayzdravkov1378
    @nikolayzdravkov1378 10 months ago +1

    You probably shouldn't optimize to that level. But when your DB type is defined as tinyint (byte) I would better follow the same memory layout in code, to avoid any marshaling problems.

  • @casperhansen826
    @casperhansen826 10 months ago

    I once made the terrible mistake of storing the weight of a person in a byte in the database, and then someone wanted the weight in lb. Never trying to save a few bytes again.

  • @petrusion2827
    @petrusion2827 10 months ago

    LOL, I wasn't prepared to hear Nick's "yEaH THaTs A GoOD AdVIcE HEeHeeheEhe". He's normally so eloquently spoken, I did a double take hearing him speak that way. Good video

  • @7th_CAV_Trooper
    @7th_CAV_Trooper 10 months ago

    People suggesting using non-word numeric types to save memory don't understand how memory is allocated, aligned and how reading/writing is optimized.

  • @nitrous1001
    @nitrous1001 10 months ago

    Enums have always been an irredeemable curse that would take a major breaking change to fix.
    The fact that people have to write source generators to improve their performance and memory use is already a sign of that.