@@projectuber Yeah, but you generally don't use a Guid as a shard key, and at least with MongoDB the _id is normally a 12-byte ObjectId, so also not a Guid.
@@premiersi I just asked because I only care about minimal APIs and different ways of using them. I only use dotnet for backend and I put my brackets on the same line as the signature 😂. So I'm already a blasphemous coder going against the dotnet grain, but I love me some minimal APIs though 😉. Sorry if I disrespected any true dotnet people, and yes, I am using macOS haha
I thought that the performance would be discussed at the DB level, for example sorting 10 million records. It is still unclear which column type should be used for each database.
It depends on use case. It is very much possible that you need to generate many ids at once, and even if this is not the case it's good to see there being interest in making it as fast as can be.
I wouldn't use an external library if a similar feature can be achieved in standard .NET. The performance gain when generating a GUID is too negligible to justify a third-party library.
Cysharp.Ulid doesn't seem to be thread safe. There's a race condition in RandomProvider+XorShift64.Next(). See below ¹. It's also very simple compared to System.Random, but I don't know if the algorithm is good enough or not. It's probably a large part of the performance improvement over Guid. I'm also not sure why they did it like that in the first place. The spec seems to call for it to generate a random value only the first time a Ulid is requested, or the first time per ms, then incrementing that value by 1 for each subsequent new Ulid. In short, the spec would have the Ulids in the same ms also in correct sort order, while in the C# version the order of the ids created in the same millisecond is random. It does make it impossible to guess the next ID though, unlike in the spec. ¹ UInt64 x = 88172645463325252UL; public UInt64 Next() { x = x ^ (x << 7); return x = x ^ (x >> 9); }
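A rough sketch of the monotonic behaviour this comment attributes to the spec, written in Python rather than C#: draw fresh randomness when the millisecond changes, otherwise add 1 to the previous random part. The class name, the injectable `clock` and `rand` parameters, and the lock are illustrative assumptions, not Cysharp.Ulid's API.

```python
import os
import threading
import time


class MonotonicUlidSketch:
    """Hypothetical generator: 48-bit ms timestamp + 80-bit random part."""

    def __init__(self, clock=None, rand=None):
        # Both callables are injectable so the behaviour can be tested.
        self._clock = clock or (lambda: time.time_ns() // 1_000_000)
        self._rand = rand or (lambda: int.from_bytes(os.urandom(10), "big"))
        self._lock = threading.Lock()  # guards the shared state below
        self._last_ms = -1
        self._last_rand = 0

    def next_value(self) -> int:
        with self._lock:
            now_ms = self._clock()
            if now_ms > self._last_ms:
                # New millisecond: start from fresh randomness.
                self._last_ms = now_ms
                self._last_rand = self._rand()
            else:
                # Same millisecond (or the clock went backwards): increment.
                self._last_rand += 1
                if self._last_rand >= 1 << 80:
                    raise OverflowError("random part exhausted for this millisecond")
            return (self._last_ms << 80) | self._last_rand
```

The trade-off mentioned in the comment shows up directly: ids created within one millisecond sort in creation order, but the next id is guessable from the previous one.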
Nothing. However, there are times when you can't use an auto-incrementing number so Guids are often used. The downside is that they are not sequential, so you have to rely on a datetime stamp or something else to maintain chronological order. It's rare for me, but Sitecore is a CMS platform that uses Guid as PK for good reason. Sequence is not an issue for them, so V4 Guids work fine.
So, removing the randomness from GUID is a good thing? Has anyone else thought of an attack that could be assisted by knowing how busy a server is? Any developer or IT manager who suggests the use of these should be demoted. No one running multiple parallel systems is going to want a GUID that can only be 20/32 as unique as a full 32-hex-digit GUID.
@@stignielsson2697 if you shorten the random part of a GUID, you remove its randomness and risk duplicate IDs, unless you add some additional checking. No one running a large cluster is going to want to make a DB query before using a GUID.
Video: “Don’t use UUIDs, use these instead” Me: I swear to god, if this is yet another video bashing UUID for reasons that are either outdated, negligible, or irrelevant to real world applications and recommending yet another replacement with no industry support and no reason to exist other than performance for performance’s sake (and bonus points if their complaint focuses on the string form of UUID even though that isn’t the real UUID representation)… Video: “Use ULIDs, they are so much faster to generate than UUIDs and are sortable.” Me: I rest my case, your honor.
What column type is this stored as in the database when the DB only supports GUIDs, for example Postgres? If we call ToGuid() before writing to the DB, what is the benefit then?
You do realize that NHibernate and other good ORMs have had strategies for sequential GUIDs for over a decade right? Maybe this is better, but it did the same basic thing.
Millisecond-level resolution is not sufficient these days. Are the ULIDs generated within a millisecond still time-sortable? We really should move to nanosecond resolution already; there are industries where a million messages per second on a single CPU core is also turning out to be insufficient.
Can we have a video on integrating the Ulid library from Cysharp with EF Core/Dapper? The author provides some documentation, but I think it could be covered in more detail.
Is there a reason you need a global unique id over the much simpler autoincrement/identity integer id? Unless you have multiple databases that can't share the same ids.
@@thumcheechon5081 This. Also Microsoft keeps adding and enhancing more and more C#/dotnet features in VS Code, which makes me believe that at some point they will actually phase out VS in favor of VS Code. Now although VS Code has a lot of practical advantages over VS (it's generally faster, runs on all OSes, is more in the open-source space and with that has a very wide user base already), I still prefer 'full fat' VS. Not just for C#/dotnet.
Been a lifelong VS Pro/Ent (20+ years) user until I gave Rider a shot about a year ago, and haven't looked back. It's better in nearly every single way, and less than half the price. Seriously though, I even stopped using SSMS (or PostgresAdmin) and just have a separate Rider instance open for database queries and work. The bells and whistles seem like a lot at first, but once you start getting familiar with them you realize they are each very well tested and thought out.
I think I decided not to comment about ULID due to the sheer difference, but it's nice to see it being mentioned. I personally do use GUIDs for ids, but I use ULIDs for ids that are shared publicly. ULIDs are very easy to use as public ids since they are short and can also easily be used in a URL.
ULIDs can be predicted during a man-in-the-middle attack
I am using UUID for public, id (int) for internal
That’s amazing. I’ve been using Ulid for ages, but being able to use it in existing systems is awesome
The main purpose of UUIDs is to prevent enumeration, which in turn makes public data sets more secure. SQL is still most performant when using integer PKs (as far as I know anyway), so the way I deal with this is: at the component level you can use whatever identifier you want, at the DB level you join and index via the PK, and the presentation layer only has access to the UUID. To me it's just straightforward, instead of adding a bunch of different hashes
UUIDs in an RDBMS are fine, use them to avoid enumeration and easy number-based lookups/attacks. Don't use them for relationships. Easy. No need to reinvent the wheel
So, just store them as an additional column like cPublicID CHAR(26) with index?
@@MrSupasonik most RDBMSs have a uuid type, and yes, a unique index ... use that to query externally, only use int relationships on joins, etc.
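The split described in this sub-thread (integer keys inside the database, a UUID only at the edge) could look roughly like this. SQLite is used only so the sketch runs anywhere; the table names, column names, and helper functions are made up for illustration.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    id        INTEGER PRIMARY KEY,          -- internal key, never exposed
    public_id TEXT    NOT NULL UNIQUE,      -- UUID handed to API clients
    name      TEXT    NOT NULL
);
CREATE TABLE invoice (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(id),  -- int relationship
    total_cents INTEGER NOT NULL
);
""")


def create_customer(name: str) -> str:
    """Insert a customer and return only the public UUID."""
    public_id = str(uuid.uuid4())
    conn.execute("INSERT INTO customer (public_id, name) VALUES (?, ?)", (public_id, name))
    return public_id


def add_invoice(public_id: str, total_cents: int) -> None:
    """Resolve the public id to the internal key once, then store the int."""
    row = conn.execute("SELECT id FROM customer WHERE public_id = ?", (public_id,)).fetchone()
    if row is None:
        raise KeyError(public_id)
    conn.execute("INSERT INTO invoice (customer_id, total_cents) VALUES (?, ?)", (row[0], total_cents))


def invoices_for(public_id: str) -> list:
    """External lookup by UUID; the join itself runs on integer keys."""
    rows = conn.execute(
        "SELECT i.total_cents FROM invoice i "
        "JOIN customer c ON c.id = i.customer_id "
        "WHERE c.public_id = ? ORDER BY i.id",
        (public_id,),
    ).fetchall()
    return [r[0] for r in rows]
```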
If you are creating your v7 Guid (or Ulid for that matter) in a distributed setup, like multiple instances of a service running on multiple machines, or if you generate it on a local client, then you can't be sure they are sequential because the time might differ between machines. Also, there is a sequential UUID in MS SQL Server if you generate the ID on the DB server side. But one point of using a Guid is that you can generate the row "client side". If you want a server-assigned id then I personally think it might be better to use a sequence.
One thing to keep in mind is that PostgreSQL and MSSQL don't order GUIDs the same way. SQL Server groups the bytes together in a certain way when ordering. To my understanding ULID is good for inserting into a Postgres database but not right for SQL Server. I'm using the RT.Comb package for generating sequential GUIDs for use with SQL Server.
Whether that's useful depends on the UUID version. Some benefit, others don't. UUID v4 is random so it can't benefit from reordering (the same goes for the hash-based v3 and v5), unlike the time-based v1 and v2. UUID v6 (a reordered v1) and UUIDv7 are already sortable in their original byte order. Having a database with a native UUID type also helps with minimising storage space.
You are right. ULIDs lose their creation-order sortability if stored as UUIDs on SQL Server. They are only sortable if stored as CHAR(26), which is not an ideal size for a clustered index.
This should be addressed, TBH. We are an SQL Server shop and have been using some “GuidComb” class from some ancient blog since 2009 or so.
@@SlyEcho That is the best sequential UUID implementation I know of. You are talking about the one with the sequential bytes at the end of the UUID, right? It solves both the problem of having the UUID remaining sequential in a database like SQL Server, and it actually reduces fragmentation by a meaningful percentage. ULIDs do neither. NHibernate has a nice implementation. That's when I first saw it.
I still argue we overthink index fragmentation, though. Especially if you never expect your table to grow beyond a couple million records, or you are storing on the cloud. In the latter case I would worry more about the size of my index table instead and the cost that is bringing to my company. UNIQUEIDENTIFIERS are way too big.
@@Marfig Yes, the byte array part contains some form of date-time, because SQL server orders by this part first.
But also, you should never expect your tables to "only contain a couple million rows". That kind of thinking has caused a lot of problems in the past for us at least.
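A minimal sketch of the COMB idea discussed in this thread: random bytes first, a 6-byte timestamp last. The sort key below is a simplified model that only assumes SQL Server compares the trailing six bytes before the rest; the function names are hypothetical, and this is not the RT.Comb or NHibernate implementation.

```python
import os
import time


def new_comb(now_ms=None) -> bytes:
    """Return 16 bytes: 10 random bytes followed by a 6-byte big-endian ms timestamp."""
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000
    return os.urandom(10) + now_ms.to_bytes(6, "big")


def trailing_group_first_key(raw: bytes) -> bytes:
    """Simplified ordering model (assumption): the last six bytes are most significant."""
    return raw[10:] + raw[:10]


older = new_comb(now_ms=1_700_000_000_000)
newer = new_comb(now_ms=1_700_000_000_001)

# Under the assumed ordering, creation order wins regardless of the random prefix.
ordered = sorted([newer, older], key=trailing_group_first_key)
```

A plain left-to-right byte comparison of the same values would be dominated by the random prefix, which is the mismatch between databases that the comments above describe.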
man, that’s really neat.
thx, didn’t have that on my radar. heck, was using hilo most of the time to avoid fragmentation and/or massive id values just to be able to ensure uniqueness.
will see how that one fits in with new stuff. still creates large keys, but the simplicity might be well worth it
Besides the base32 encoding, it is the same as UUID v7. Both are 128 bits; ULID just does not have the 4-bit version and 2-bit variant fields.
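To make that comparison concrete, here is a sketch that packs the same timestamp into both layouts. The UUIDv7 field widths (48-bit timestamp, 4-bit version, 12 random bits, 2-bit variant, 62 random bits) follow RFC 9562; the helper names are made up for illustration.

```python
import uuid


def ulid_int(ts_ms: int, rand80: int) -> int:
    """ULID layout: 48-bit ms timestamp, then 80 random bits."""
    return (ts_ms << 80) | rand80


def uuid7_int(ts_ms: int, rand_a: int, rand_b: int) -> int:
    """UUIDv7 layout: timestamp | version 7 | 12 random bits | variant 10 | 62 random bits."""
    return (ts_ms << 80) | (0x7 << 76) | (rand_a << 64) | (0b10 << 62) | rand_b


ts = 1_700_000_000_000
as_uuid = uuid.UUID(int=uuid7_int(ts, rand_a=0xABC, rand_b=0x123456789))

# Both layouts keep the timestamp in the top 48 bits, which is what makes
# them sort by creation time when compared as big-endian bytes.
same_prefix = (as_uuid.int >> 80) == (ulid_int(ts, 0) >> 80)
```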
But I need to add extra integration for using it with literally everything that uses GUIDs out of the box.
Adding a lot of complexity.
Totally not worth it.
Exactly. As I understand it UUIDv7 was directly inspired by ULID but with the version/variant tweaks to make it a valid UUID. Slightly less entropy so more chance of collisions but compatible with existing UUID software ecosystems.
Nowadays some operations in modern PLs and CPUs take not even microseconds, but nanoseconds... and we still use millisecond timestamps, considering them unique enough?
Although Ulid is faster and smaller, I think in terms of reuse of available tools it's not worth it. It's not an RFC, so depending on context you'll have to customize or create tools. Seems like a big price to pay for a little extra performance. Use UUID v7, which has all the features of Ulid and is widely used.
This can be useful for Azure Table Storage as well since data is stored ordered by partition key then by row key.
Hi Nick! Thanks for the video, never heard about ULID.
You recommend to not use UUIDs but if I didn't miss anything you didn't cover how fast ULID works in the database.
Let's take Postgres for example. I assume we store ULIDs in a text column.
Sorting 5 million rows by a text ULID column takes twice as long as sorting 5 million GUIDs (typical use cases: create a materialized view optimised for reading in sorted order, select distinct, etc). Text ULIDs also consume more space.
It looks like they are considerably worse for the database in both storage and operations than UUIDv7, but more optimised when generating them.
So I would say it depends on the use case.
Until the database engine has support for it, there's no reason to not either a) convert to guid and store as uniqueidentifier, or b) store it as binary(16).
@@billy65bob at this point it’s better just to use UUIDv7
Don't expect much. This 2-video series on UUIDs as primary keys is just not high quality and is very low on details, to the point of being useless and frankly bad advice. Your questions are important, and reflect the limitations of UUIDs in databases. But unfortunately there's no discussion in these videos about the actual usage of these keys. I've been working with databases for more than 30 years, and we just know that we cannot reduce the discussion to "use this instead of that". There's more to your primary key choices: size, data type and data format. Fragmentation is not an absolute evil, for instance, and it depends on your expected table size and where you are storing it (cloud vs local). Another example: what is true for INSERTS may not be for SORTS, etc. Just don't come here for those answers.
@@Marfig That's a fair assessment. I'd like to see more on how to use the UUID and which database types it's useful for.
@@j2csharp Well, to answer your initial question more directly, UUIDs and ULIDs are great for in-memory processing. But terrible for indexing tables. In the case of ULIDs, they can also be stored as UNIQUEIDENTIFIERS. They are compatible. But it requires decoding the ULID string into a byte array and then creating a new GUID by passing that byte array to the constructor. However, you will lose the lexicographic sorting capabilities of the ULID. Not useful at all, and you will need more CPU clocks to generate an ID. Storing as a CHAR(26) can be your option if you don't care about clustered index size, or, as you noted, the impact on sort and table inserts. So what should you do? Well, if you just need creation-time sorting capabilities, use a CreatedAt column. If you need non-guessable unique IDs use NanoId, which allows you to select your key alphabet and size, giving you a lot more room for your table to grow before you start experiencing the same problems as UUIDs. Or use CUID2, which is larger but the most secure of the UID generators. If you need non-guessable sortable IDs, there's no easy way: you'll need to implement a GUID Comb strategy in your code, making sure that the last part of the GUID contains the timestamp. Or you use a text ULID with the problems you know. So, what is the better option? The thing that these videos keep not telling you is that it depends on your table, how the data is used, and what your database is. There's no one-size-fits-all solution *sigh*
BTW. One last note. When using UUIDs as database primary keys, it's actually bad for UUID generation to be too fast. It should be fast, just not too fast. And this is particularly true of sortable UUIDs, because they are more vulnerable to guessing and can more easily be the target of brute force attacks. So, as a matter of fact, ULID speed plays against it when it is used in databases as a row identifier.
Why would you sort by a unique identifier? If we had to sort by time, wouldn't we just add a datetime column?
I made my own Guid V4 because of this video. I did the timestamp similar to V7, but on the right side, and also included nanoseconds. This way it's sortable in a SQL database too, and it's unique. A datetime column is not unique. This sortable unique key is very handy for some tasks and simplifies them. Sometimes I need to know the exact order the records came in. Then I used an order field of type long. The problem of overflow has to be addressed.
Great presentation! 🎉 I have learnt a lot from it and I'm really sorry that I didn't know about ulid 10 years ago.
Loved it, but I'm already using a custom version-7-like guid.
What I need is a way to store a 124bit int in Oracle/SQL/PostgreSQL that is good on performance.
Sweet. Thanks.
Do the resulting Guids from Ulids still appear ordered based on time? I'd assume so?
How is the database performance w/ sorting, etc, etc ?
At best it's just as good as an integer, at worst it can take up to 4x the memory. Depends on the DBMS too
Just use ints
I'd still rather use ints, and I have a simple guid as a "unique-id" that I usually use to sync between databases and expose when necessary. My internal Id never leaves my database.
The more we live - the more we learn.
Always a bit funny to find out, that community made something more efficient and useful than Microsoft 😊
I don't know why this always happens?! 😅 Another small example: MudBlazor is much better than Fluent UI. Isn't that funny?! 🤦‍♂️😅
It would be nice to have some sort of conversion for getting the timestamp out of a UUID v7, for instance to see when the ID was created.
Can we trust it? Will this package get support? Will this package validate correctly against buffer overflow? Just because it uses SIMD doesn’t mean it’s “so, so good”; to me “so, so good” means “so, so reliable, with great support”.
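For what it's worth, the conversion is short, since a UUIDv7 keeps the Unix time in milliseconds in its top 48 bits. A sketch in Python; the function name is made up and the sample value is only an example.

```python
import uuid
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def uuid7_created_at(text: str) -> datetime:
    """Read the 48-bit millisecond timestamp from the front of a UUIDv7."""
    value = uuid.UUID(text)
    if value.version != 7:
        raise ValueError("not a version 7 UUID")
    millis = value.int >> 80  # drop the lower 80 bits (version, variant, random)
    return EPOCH + timedelta(milliseconds=millis)


created = uuid7_created_at("017f22e2-79b0-7cc3-98c4-dc0c0c07398f")
# created is 2022-02-22 19:22:22 UTC
```

The result is only as trustworthy as the clock of the machine that generated the id.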
Have you taken a look at their GitHub repo yet?
So the ulid will be stored as a varchar string in the database?
Better to convert to guid and store as uniqueidentifier
I'm guessing CHAR(26)
The package comes with byte array converters so you can convert it to 16 bytes or just store the 26 characters of the string
I wish he had touched on that; however, he did talk about the ToGuid() method, which can be used to convert to a GUID for the database
I had it on the original notes and I completely forgot 🤦
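On the storage question in this sub-thread, the two forms are the same 128 bits: 16 raw bytes, or 26 characters of Crockford base32. A small sketch of the round trip, with hypothetical function names (the package's own converters are not shown here):

```python
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"  # no I, L, O, U


def ulid_bytes_to_text(raw: bytes) -> str:
    """Encode 16 bytes as 26 base32 characters."""
    if len(raw) != 16:
        raise ValueError("expected 16 bytes")
    value = int.from_bytes(raw, "big")
    # 26 characters x 5 bits = 130 bits; the top 2 bits are always zero.
    return "".join(CROCKFORD[(value >> shift) & 0x1F] for shift in range(125, -1, -5))


def ulid_text_to_bytes(text: str) -> bytes:
    """Decode 26 base32 characters back to 16 bytes."""
    if len(text) != 26:
        raise ValueError("expected 26 characters")
    value = 0
    for ch in text.upper():
        value = (value << 5) | CROCKFORD.index(ch)  # ValueError on invalid characters
    if value >> 128:
        raise ValueError("value does not fit in 128 bits")
    return value.to_bytes(16, "big")


raw = bytes(range(16))
text = ulid_bytes_to_text(raw)
```

Because the alphabet is in ascending ASCII order, the text form sorts the same way as the big-endian bytes; the cost is 26 bytes per key instead of 16.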
I think you should also mention how to spot what is a v4 GUID or v7 etc. It’s very easy because there is always a -4 in the middle and so on. Once you see it you will always see it
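The trick mentioned here can be written down in a couple of lines: in the canonical 8-4-4-4-12 text form, the version is the first hex digit of the third group. A sketch with a made-up helper name:

```python
import uuid


def version_digit(text: str) -> str:
    """First character of the third group in the canonical 8-4-4-4-12 form."""
    return text.split("-")[2][0]


samples = [str(uuid.uuid4()), "017f22e2-79b0-7cc3-98c4-dc0c0c07398f"]
digits = [version_digit(s) for s in samples]  # ['4', '7']
```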
So glad ULIDs are getting popular. They are really better in every way than GUIDs/UUIDs.
Any downsides?
They are not better in any way than any plain unguessable token. Stop saying that time+rand() is an id.
Any good database has been generating ids, including synthetic uuids, for two decades. Base32 is a bad scheme compared to base36 or base62, and both can be implemented in a very performant way.
So, compared to GUID it is not better: it is not a GUID, it is a plain random token which you can't trust in the same way as a GUID. So, using it is actually possible, but adding a custom name like MySUID/ULID did not make it better in any way than existing id schemes. At the very least, stop crying that it is better. It is not, it's a dumb old scheme.
@@ZuvielDrama None, except the fact that the time component sorta gives away the frequency of object generation.
@@NotACat20 Thanks. That rant converted me totally!
@@VanDameDev I'm glad that at least one person heard me. There are similar comments around, and I'm sorry for ranting - I can do it endlessly. :) Probably I am too old, and when I was young software was more limited, but overall better. :) Thank you.
My only concern is collision resistance: ULID may not be as collision-resistant as GUIDs, especially in highly concurrent systems.
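To put rough numbers on that concern, the usual birthday approximation can be applied to the random bits of each format. This is a back-of-the-envelope sketch: the burst size is hypothetical, and the bit counts assume every non-timestamp, non-version bit is random (122 for UUIDv4, 80 for ULID, up to 74 for UUIDv7).

```python
import math


def collision_probability(n_ids: int, random_bits: int) -> float:
    """Birthday approximation: p ~= 1 - exp(-n(n-1) / (2 * 2**bits))."""
    space = 2 ** random_bits
    return -math.expm1(-(n_ids * (n_ids - 1)) / (2 * space))


# For ULID and UUIDv7 only ids created in the same millisecond can collide,
# because the timestamp prefix differs otherwise.
per_ms = 1_000_000  # hypothetical burst: one million ids in a single millisecond

p_ulid = collision_probability(per_ms, 80)
p_uuid7 = collision_probability(per_ms, 74)
p_uuid4 = collision_probability(per_ms, 122)
```

Under these assumptions ULID sits between UUIDv7 and UUIDv4, and all three probabilities stay far below one in a billion even for this extreme burst.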
It would be very interesting if we could generate ULIDs directly on T-SQL or even PL/SQL.
We have that with integers and GUIDs, and it's an important feature to have in any B2B software that eventually requires manipulating data directly in the database via stored procedures and the like.
I'm still not sold on any idea that, for non-distributed databases, anything other than integers is better for PKs, but maybe I'm just old.
Please show example with postgres integration for ulid and uuid7
I just was looking for it last week and found the same library.
Would it be a best practice to replace all my GUIDs in a SQL database with ULIDs of type char(26)/varchar(26), knowing that most of those IDs are exposed by APIs for CRUD operations? From what I understood from the video, the performance should be even better and it should be more secure. Am I correct?
Likely not. GUIDs are stored as 16-byte values in MS-SQL, which should perform better than a 26-byte varchar. When using GUIDs in MSSQL, they are only a performance problem if they are being generated in .NET (not on the SQL server) and are being used as a clustered index.
@@StateHasChanged Exactly. Clustered Index is the key point here. Having a good level of knowledge around indexing and clustered indexing is important here. Take a look at this video, which is full on DBA territory, but developers need to know how this stuff works in order to get the best out of their software ruclips.net/video/rvZwMNJxqVo/видео.html
Are the Guids generated from a Ulid still sorted by time?
In many cases IDs also have to be created on the database. You would probably need a DLL as a user-defined function to achieve this.
No you wouldn’t. You can just pass the generated id to the database
@@nickchapsas Depends on the circumstances. There might be scenarios where for example a stored procedure or trigger generate rows directly on the database. Or when you are fixing faulty data (should not happen but it does).
@@gliaMe I wouldn't use SPs or Triggers ever
That’s interesting, why would you not use SPs? I use them fairly rarely but I thought they were considered a good way to restrict what a 3rd party can do in your DB?
@@ghaf222 Mostly for calculation-intensive business logic and reporting, which would be too slow when not done directly on the database. Admittedly, these are mostly read operations, but still, I would not completely exclude such a use case.
Can I replace existing Guid v4 with v7 (or ulid.toguid) on an existing dataset?
How does it compare to the Guid generator from EF Core?
Will Entity Framework implement v7 by default in a Key property?
Does the ToGuid method map it to a v7 Guid?
A guid is a guid; v4 and v7 are ways of generating a guid. Technically a guid is just 128 bits of (mostly) random data.
@@CrashM85 To elaborate on "mostly": the nibble at bit index 48 (0 being the leftmost MSB) indicates the Guid version, e.g. the default random v4 and the new v7 time+random.
The ToGuid method doesn't specify the version nibble at all, so it's just going to be whatever was at that location in the Ulid. So about 1/16 of the values would be valid v4 and 1/16 would be valid v7. Technically 1/16 would also be valid v8, since those are domain specific and can be defined any way you want.
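A small sketch of what "the nibble at bit index 48" means in practice, using Python's standard `uuid` module rather than .NET. The helper names `version_nibble` and `with_version` are made up for illustration, and the sketch leaves the variant bits alone.

```python
import uuid


def version_nibble(value: uuid.UUID) -> int:
    """Return bits 48-51, counting from the most significant bit."""
    # 128 - 52 = 76 bits sit to the right of that nibble.
    return (value.int >> 76) & 0xF


def with_version(value: uuid.UUID, version: int) -> uuid.UUID:
    """Overwrite only the version nibble of an arbitrary 128-bit value."""
    cleared = value.int & ~(0xF << 76)
    return uuid.UUID(int=cleared | (version << 76))


random_v4 = uuid.uuid4()
print(version_nibble(random_v4))                   # a freshly generated v4 carries 4 here
print(version_nibble(with_version(random_v4, 7)))  # same bits, now stamped as 7
```

A converter that copies 128 bits across without stamping this nibble leaves whatever value happened to be there, which is where the 1-in-16 figure comes from.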
Quick question: with v7 and ULID, can you decode the time component back to a datetime?
The Ulid library supports converting to Guid and DateTimeOffset which can be converted to a DateTime.
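For anyone curious how that decoding works without the library, here is a rough sketch. It assumes the documented layouts: the top 48 bits of a UUID v7 are Unix milliseconds, and the first 10 characters of a ULID string encode the same 48-bit value in Crockford base32. The function names are made up for this example.

```python
import uuid
from datetime import datetime, timedelta, timezone

CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def uuid7_timestamp(value: uuid.UUID) -> datetime:
    """Read the top 48 bits of a v7 UUID as Unix milliseconds."""
    return UNIX_EPOCH + timedelta(milliseconds=value.int >> 80)


def ulid_timestamp(text: str) -> datetime:
    """Decode the first 10 Crockford base32 characters of a ULID string."""
    ms = 0
    for ch in text[:10].upper():
        ms = ms * 32 + CROCKFORD.index(ch)
    return UNIX_EPOCH + timedelta(milliseconds=ms)


# The example ULID string from the ULID spec README.
print(ulid_timestamp("01ARZ3NDEKTSV4RRFFQ69G5FAV").isoformat())
```

Note that this only reads the bits; it does not check that the input really is a v7 UUID or a valid ULID.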
I'll stick with the industry standard GUID. Still way better....
Quite sure the industry standard is still a numeric value, most often an integer.
@@Timelog88 It depends upon what you call the "industry standard"; unique ids are pretty common for microservices and distributed systems.
@@Timelog88 Sharding is a thing and ints make that a lot harder (you'd have the same int Id in multiple shards)
@@projectuber Yeah, but you generally don't use a Guid as the shard key, and at least with MongoDB the _id is normally a 12-byte ObjectId, so also not a Guid.
What normally formed databases get fragmented from a guid?
And why would you ever order based on a natural key? The engines don't do it.
Is the course in minimal api or is it the usual controller app?
The vertical slice course uses minimal APIs.
@@premiersi Thanks for responding
@@premiersi I just asked because I only care about minimal APIs and different ways of using them. I only use dotnet for backend and I put my brackets on the same line as the signature 😂. So I'm already a blasphemous coder going against the dotnet grain, but I love me some minimal APIs though 😉. Sorry if I disrespected any true dotnet people, and yes, I am using macOS haha
I'm using it in production environment for events ✔
I thought that the performance would be discussed at the DB level. For example, sorting 10 million records.
It is still unclear which column type should be used for each database.
Performance is not really the most important concern when it comes to ID generation. This video's focus on perf was kind of pointless.
Exactly
It depends on use case. It is very much possible that you need to generate many ids at once, and even if this is not the case it's good to see there being interest in making it as fast as can be.
Tell me you lack experience without telling me you lack experience..
Really wonder what they plan to use the extra nanoseconds for. Greatness can be achieved in that.
@@MikeZadik If your database id is a uuid/guid, then the perf hit will definitely NOT be nanoseconds / milliseconds
Twitter's Snowflake IDs are another option. More storage efficient.
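For context on the storage point, a Snowflake id fits in a 64-bit integer (8 bytes versus 16 for a GUID or ULID). Below is a minimal sketch of the commonly described layout: 41 bits of milliseconds since a custom epoch, 10 bits of machine id, and 12 bits of per-millisecond sequence. The function names are made up, and a real generator also needs clock and sequence handling that is left out here.

```python
import time

TWITTER_EPOCH_MS = 1288834974657  # custom epoch used by Twitter's original implementation


def make_snowflake(timestamp_ms: int, machine_id: int, sequence: int) -> int:
    """Pack timestamp (41 bits), machine id (10 bits) and sequence (12 bits)."""
    if not 0 <= machine_id < 1024:
        raise ValueError("machine_id must fit in 10 bits")
    if not 0 <= sequence < 4096:
        raise ValueError("sequence must fit in 12 bits")
    return ((timestamp_ms - TWITTER_EPOCH_MS) << 22) | (machine_id << 12) | sequence


def split_snowflake(value: int) -> tuple[int, int, int]:
    """Return (timestamp_ms, machine_id, sequence) for a packed id."""
    return (value >> 22) + TWITTER_EPOCH_MS, (value >> 12) & 0x3FF, value & 0xFFF


now_ms = time.time_ns() // 1_000_000
snowflake = make_snowflake(now_ms, machine_id=7, sequence=0)
print(snowflake, snowflake.bit_length(), "bits")
```

The trade-off is that every generator needs a unique machine id assigned up front, which GUIDs and ULIDs avoid by spending bits on randomness instead.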
Ulid was annoying to set up but hey, at least it's good for performance.
Does postgres support the type ulid?
No, but one can install third-party extensions like pg-ulid and use a type "ulid" in scripts.
@@MrSupasonik never in production
Does MSSQL support ULID?
I'd not use an external library if a similar feature can be achieved in standard .NET. The performance gain over generating a GUID is too negligible to justify taking on a third-party library.
Cysharp.Ulid doesn't seem to be thread safe. There's a race condition in RandomProvider+XorShift64.Next(). See below ¹. It's also very simple compared to System.Random, but I don't know if the algorithm is good enough or not. It's probably a large part of the performance improvement over Guid.
I'm also not sure why they did it like that in the first place. The spec seems to call for it to generate a random value only the first time a Ulid is requested, or the first time per ms, then incrementing that value by 1 for each subsequent new Ulid. In short, the spec would have the Ulids in the same ms also in correct sort order, while in the C# version the order of the ids created in the same millisecond is random. It does make it impossible to guess the next ID though, unlike in the spec.
¹
// XorShift64 state; Next() reads and writes x with no synchronization.
UInt64 x = 88172645463325252UL;
public UInt64 Next()
{
    x = x ^ (x << 7);
    return x = x ^ (x >> 9);
}
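For comparison, the monotonic behaviour described in the spec could be sketched roughly like this (in Python rather than C#). The class name `MonotonicUlid` and the injectable `clock` are made up for this example. Treating a clock that goes backwards the same as "same millisecond" is an assumption of this sketch, not something the spec dictates.

```python
import os
import threading
import time


class MonotonicUlid:
    """Sketch of spec-style monotonic generation: fresh randomness once per
    millisecond, then +1 for every further id in that millisecond."""

    def __init__(self, clock=None):
        # clock returns Unix milliseconds; injectable so it can be tested.
        self._clock = clock or (lambda: time.time_ns() // 1_000_000)
        self._lock = threading.Lock()  # guards the shared state below
        self._last_ms = -1
        self._last_random = 0

    def next(self) -> int:
        with self._lock:
            ms = self._clock()
            if ms > self._last_ms:
                self._last_ms = ms
                self._last_random = int.from_bytes(os.urandom(10), "big")
            else:
                # Same millisecond (or the clock went backwards, an assumption here).
                self._last_random += 1
                if self._last_random >= 1 << 80:
                    raise OverflowError("random component overflowed within one ms")
            return (self._last_ms << 80) | self._last_random


generator = MonotonicUlid()
first, second = generator.next(), generator.next()
print(first < second)
```

This keeps ids from one generator strictly increasing, at the cost the comment mentions: once you have seen one id, the next one in the same millisecond is guessable.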
Kevin is an API architecture legend, awesome to see him on Dometrain!
What's wrong with int/long Id as PK in db?
Nothing. However, there are times when you can't use an auto-incrementing number so Guids are often used. The downside is that they are not sequential, so you have to rely on a datetime stamp or something else to maintain chronological order. It's rare for me, but Sitecore is a CMS platform that uses Guid as PK for good reason. Sequence is not an issue for them, so V4 Guids work fine.
Sequences can become a performance bottleneck when inserting a lot of records in the database.
Sometimes you need to create the IDs before you insert them into the DB because they are used in other relations...
@@terjes64 you can use INSERT ... RETURNING for that
@@BlTemplar but why do that when I can generate everything server side and then push it all to the database?
What was a dealbreaker for me deciding not to use that nuget library is that the Ulid type is defined under the System namespace. I mean, WTH!
What datatype would you use to store a ULID in a SQL Server database?
A 16-byte binary column or CHAR(26)
How does it work with efcore and databases?
It comes with a converter. It's made for databases
So, removing the randomness from GUID is a good thing? Has anyone else thought of an attack that could be assisted by knowing how busy a server is?
Any developer or IT manager who suggests the use of these should be demoted. No one running multiple parallel systems is going to want a GUID that is only 20/32 as unique as a full 32-hex-digit (16-byte) GUID.
Please elaborate?
@@stignielsson2697 if you shorten the random part of a GUID, you reduce its randomness and risk duplicate IDs unless you add some additional checking. No one running a large cluster is going to want to make a DB query before using a GUID.
Video: “Don’t use UUIDs, use these instead”
Me: I swear to god, if this is yet another video bashing UUID for reasons that are either outdated, negligible, or irrelevant to real world applications and recommending yet another replacement with no industry support and no reason to exist other than performance for performance’s sake (and bonus points if their complaint focuses on the string form of UUID even though that isn’t the real UUID representation)…
Video: “Use ULIDs, they are so much faster to generate than UUIDs and are sortable.”
Me: I rest my case, your honor.
What column type do you use for this in the database when the DB only supports guid, for example Postgres?
If we do ToGuid() before writing to the DB, what is the benefit then?
You do realize that NHibernate and other good ORMs have had strategies for sequential GUIDs for over a decade right? Maybe this is better, but it did the same basic thing.
Just don't store a GUID as a string. Then there is no reason to compare string lengths.
Millisecond-level resolution is not sufficient these days. Are ULIDs generated within the same millisecond still time-sortable? We really should move to nanosecond resolution already; there are industries where a million messages per second on a single CPU core is also turning out to be insufficient.
Can we have a video on integrating the Ulid library from Cysharp with EF Core/Dapper? The author provides some documentation, but I think it could be covered in more detail.
Thank you for this video! Nick, was waiting for it since your UUID v7 video.
One month later. Nick: forget about UUID v4, v7 and ULID, here is an even newer ID 😅
Is there a reason you need a global unique id over the much simpler autoincrement/identity integer id?
Unless you have multiple databases that can't share the same ids.
My guess here is that it's easier to attack an API/database using integers. Using globally unique identifiers helps reduce the attack surface.
Sequential until it's not... not a super compelling case to switch from v4, I don't think. Just don't cluster on them.
just use ints ffs
I'm curious, why use VS Code if you have access to "full fat" VS?
He's using JetBrains Rider. The black theme does make it look like VS Code
@@thumcheechon5081 This. Also Microsoft keeps adding and enhancing more and more C#/dotnet-features in VS Code, which makes me believe that at some point they will actually phase out VS in favor of VS Code.
Now, although VS Code has a lot of practical advantages over VS (it's generally faster, runs on all OSes, is more in the open-source space and with that has a very wide user base already), I still prefer 'full fat' VS. Not just for C#/dotnet.
Been a lifelong VS Pro/Ent (20+ years) user until I gave Rider a shot about a year ago, and haven't looked back. It's better in nearly every single way, and less than half the price. Seriously though, I even stopped using SSMS (or PostgresAdmin) and just have a separate Rider instance open for database queries and work. The bells and whistles seem like a lot at first, but once you start getting familiar with them you realize they are each very well tested and thought out.
First here