00:00 - 00:30 intro
00:30 - 02:25 Where to generate resource IDs (auto-incrementing IDs vs UUIDs)
02:25 - 04:05 Generate meaningful identifiers (understandable, not just readable)
04:05 - 07:37 Provide meaningful responses (that's actually pretty nice)
07:37 - 08:52 Prefer returning a JSON object response instead of an array. This way you leave room for extension.
08:52 - 08:52 Refrain from using technical jargon and prefer the language of the domain
---
01:32 Even if you're using auto-incrementing IDs, you could save the order with status "pending" and then return that ID. You could still process the order async.
You can read about HATEOAS. I think this is just a part of it. Edit: Sorry, maybe you were talking about something later in the video, not about action links?
For tip 1, if you are moving the ID up to the API and queuing the persistence, then how errors are handled becomes important and adds complexity. You might be trading the speed of create request time for having clients poll for errors or confirmations the order is created successfully. That does not invalidate the idea, particularly for systems that handle large volumes of requests, but it is a bit of an advanced topic/implementation that needs the developer to apply some thought.
Can you give an example for "You might be trading the speed of create request time for having clients poll for errors or confirmations the order is created successfully." I don't understand how generating the ID sooner influences whether you need to poll for errors or confirmations. Generating the ID at the API level makes it possible to implement such a solution where you have to poll for errors or confirmations, but it doesn't mean you have to.
@@joeke1234 My point is really about synchronous vs asynchronous processing. If the database creates the ID, it is as a result of the insert being successful. If you generate it earlier and hand it off for asynchronous processing, then you don't know if it was successful.
@@noideaprojects The same question was on my mind. If I make a meaningful identifier, I also have the responsibility of creating it, and that's hard work when you have a large volume of requests and have to coordinate multiple async functions.
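To make the sync vs async point concrete, here is a minimal sketch, assuming an in-memory queue and status store as stand-ins for a real message broker and database. The names `create_order`, `process_next` and `get_status`, and the "quantity must be positive" rule, are hypothetical and only illustrate the flow.

```python
import uuid
from collections import deque

# Hypothetical in-memory stand-ins for a message queue and a status store.
queue = deque()
status_by_id = {}


def create_order(payload: dict) -> dict:
    """API layer: assign the ID up front, enqueue the work, return immediately."""
    order_id = str(uuid.uuid4())
    status_by_id[order_id] = "pending"
    queue.append((order_id, payload))
    return {"orderId": order_id, "status": "pending"}


def process_next() -> None:
    """Worker: persist the order and record the outcome for later lookup."""
    order_id, payload = queue.popleft()
    # Stand-in for the real insert and validation (assumed rule: quantity > 0).
    if payload.get("quantity", 0) > 0:
        status_by_id[order_id] = "created"
    else:
        status_by_id[order_id] = "failed"


def get_status(order_id: str) -> str:
    """What a client would call (or poll) to learn whether the create succeeded."""
    return status_by_id[order_id]
```

The client holds an ID before anything is persisted, so a failure in `process_next` can only be reported through some later channel such as `get_status`. With a database-generated ID, by contrast, receiving the ID already implies the insert succeeded.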
@@noideaprojects Then you also need to offload data validation from DB to API, which goes against db best practices I guess, not sure. But I am sure that databases already have solutions for the kind of problems that emerge in distributed contexts, eg. unique identifiers across multiple nodes etc...
I mean, I guess you're not talking about UUID4: Version 4 (random). Thus, for variant 1 (that is, most UUIDs) a random version-4 UUID will have 6 predetermined variant and version bits, leaving 122 bits for the randomly generated part, for a total of 2^122, or 5.3×10^36 (5.3 undecillion) possible version-4 variant-1 UUIDs.
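The arithmetic in that quote can be checked with a few lines of Python, using only the standard `uuid` module. The constant name `RANDOM_BITS` is just for illustration.

```python
import uuid

# A UUID is 128 bits; 4 version bits and 2 variant bits are fixed in version 4.
RANDOM_BITS = 128 - 4 - 2
possible = 2 ** RANDOM_BITS

u = uuid.uuid4()
print(RANDOM_BITS)        # 122
print(f"{possible:.1e}")  # 5.3e+36
print(u.version)          # 4
```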
an important thing to consider if you add actions, as shown here, is caching - if, for example, the "cancel order" action is only valid for the next 15 minutes, cache expiration should be set accordingly. this can get a little tricky if you have multiple actions with different expiration. part of me honestly likes to avoid this type of problem - in some cases, you might prefer to design endpoints with less data, requiring the client to make multiple, individually-cacheable requests, each endpoint having only one reason to expire and need a refresh. just something to consider. :-)
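One way to read "cache expiration should be set accordingly" is to cap the cache lifetime at the earliest action expiry. A rough sketch, assuming each action carries an expiry timestamp; the function name `cache_max_age` and the 300-second default are made up for illustration:

```python
from datetime import datetime, timedelta, timezone


def cache_max_age(action_expiries: list[datetime], now: datetime, default: int = 300) -> int:
    """Seconds the response may be cached: until the earliest action expires."""
    if not action_expiries:
        return default  # assumed fallback when no action has an expiry
    earliest = min(action_expiries)
    return max(0, int((earliest - now).total_seconds()))


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
expiries = [now + timedelta(minutes=15), now + timedelta(hours=2)]
header = f"Cache-Control: private, max-age={cache_max_age(expiries, now)}"
print(header)  # Cache-Control: private, max-age=900
```

With several actions in one response, the shortest-lived one dictates the lifetime of the whole payload, which is the "tricky" part the comment describes.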
Do the caching at the application layer with Redis or something instead of at the network layer. Then you can invalidate the cache by firing an event on state change or queue an event for 15 minutes into the future in the case of "this order can only be cancelled if it's been made within 15 minutes"
@@spicynoodle7419 Sure, if you have control of all the clients - so it wouldn't work for something like a public API. I'm a web developer, so I tend to think of solutions that are interoperable at the network layer, such as proxies and cache servers etc. :-)
Because of the different expirations on actions, can't you add a validUntil field on the Action object itself? Then each action knows until when it's valid. The clients should then decide for themselves how to handle it, but that's none of the API's concern, imho.
I think having actions in responses does not free you from validation on the server. Even if the browser returned data from cache and the client saw a "Cancel order" button, you should not be able to do it. All you need to do is gracefully handle the 403 (or 409) situation, show the user a "sorry" message and refresh the app state. This case is not so different from a case where a user loaded a page, left for 15 mins to grab a coffee and then clicked the button. But I myself don't do "actions": too much hassle on the backend side, and I don't want to spend the server's CPU on this kind of work. It's easier for me to implement this logic on the client side. But maybe I will give it a try some day :)
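The two ideas above (a per-action validUntil, and re-checking the rule on the server regardless of what the client saw) can be sketched together. This is only an illustration: the field names, the `can_cancel` helper and the choice of 409 are assumptions, and the 15-minute pending rule is taken from the example discussed in the thread.

```python
from datetime import datetime, timedelta, timezone

CANCEL_WINDOW = timedelta(minutes=15)  # rule from the example in the thread


def can_cancel(order: dict, now: datetime) -> bool:
    """Single source of truth for the business rule."""
    return order["status"] == "pending" and now - order["createdAt"] < CANCEL_WINDOW


def render_order(order: dict, now: datetime) -> dict:
    """Build the response, advertising the action only while it is valid."""
    actions = []
    if can_cancel(order, now):
        actions.append({
            "name": "CancelOrder",
            "validUntil": (order["createdAt"] + CANCEL_WINDOW).isoformat(),
        })
    return {"orderId": order["orderId"], "status": order["status"], "actions": actions}


def cancel_order(order: dict, now: datetime) -> int:
    """Return an HTTP-style status code; the rule is re-checked at call time."""
    if not can_cancel(order, now):
        return 409  # stale client state: the action is no longer allowed
    order["status"] = "cancelled"
    return 200
```

A client that rendered the button from a cached response and clicks it after the window has passed simply gets the 409 and can refresh its state.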
@@Baby4Ghost again, yes, assuming you have control of all the clients. if you're building a public API, that's a problem - if you're building a private backend API for a client app you control, go for it. :-)
Really interesting ideas! If there was one thing about API design I'd really love people to stick to, it would also be just sticking to well known REST principles. May it be the design of the URL, or the response. I've seen so many crude apis which e.g. list getting an account as "api/account/getAccount" instead of just "api/account", and modifying entries of a catalogue as something like "api/catalogue/updateCatalogueWithItem" instead of something like PUT / POSTing to "api/catalogue//items". Responses are the same. We have well defined status codes and http statuscodes, but a lot of apis would rather return a 200 with a body containing custom enums... It's a mess out there. I wish this stuff was more standardized across the board.
It depends on the audience, and your architecture. I agree that if you are going to be providing an API that is being consumed by external parties, using generally accepted REST “best practices” makes a lot of sense. In my company, however, the VAST majority of APIs that we create are never consumed externally. They are only consumed by the UI that customers use. And in these cases we don’t even write backend controllers/endpoints, but rather use a “cqrs”-like pattern, where things are named like “CancelOrder, DispatchOrder, FlagOrderForFraud”. Our controllers, and the API clients used by the UI, are all source generated, and as such, our endpoints end up being /Orders/CancelOrder, /Orders/DispatchOrder, /Orders/GetOrder?id=123. This is totally fine. Agree on the status codes regardless.
If you follow REST for everything then you have to do a PATCH /orders/1 with a body of { "command": "cancel" } which doesn't make much sense. I'd rather have a single controller handle a single command in a task-based UI. Here a POST /orders/1/cancel is what I would go for
@@spicynoodle7419 Let's say the case is liking a blog post with the endpoint POST /posts/1/like. If I want to cancel / undo the like, is it better to use DELETE /posts/1/like or POST /posts/1/cancel-like?
@@fourthkim3715 I would make it /dislike because it's already an explicit action in the UI so it doesn't make sense to make it a generic CRUD call and infer the action in the controller. Do you have /register, /verify-email, /login, /logout, /forgot-password, /new-password? Or do you have some /auth resource that you call with different verbs? Or is it something silly like POST /user for a register, DELETE /session/{userid} for logout? What about changing the order of some drag & drop list. Do you update the list and do if (request has order list)?
@@fourthkim3715 I lean towards a POST for all actions you can do on an object. In case of likes you will probably not only have a table containing likes. Chances are you have a counter keeping track of the amount of likes. In that case you are not only doing a delete but you will also update another record holding the like count.
just don't roll your own "unique" randomly generated identifiers with less entropy than UUIDv4s trying to make them more meaningful. also would be helpful to point out that including the actions in the response is called HATEOAS for those trying to read more about it :)
This is such great advice - everyone needs to watch this, you've covered everything wrong with the AWS public APIs at least. One recommendation/alteration I would like to add, especially if this API is specifically for integration, is to use capabilities as your identifier scheme. That is, the opaque part should have more than 16 bytes of entropy and be cryptographically generated. This reduces the need for complex permissions checking in the server and allows integrations to subset access to your objects without having to proxy all requests. Object-capability security has been a long time coming on the web, but videos like this one give me hope.
@3:00 The topic of "understandable IDs" resonated with me. I've always been keen on designing IDs that provide insight into the data's origin or domain just by their format. However, I've consistently been advised against it. Many blog articles emphasize that IDs shouldn't carry business logic and doing so might even be considered poor practice, since their primary function is to index unique rows in a database.
They aren't mutually exclusive. You can still have a primary key and foreign key for a database be one thing, and an ID used for display and other lookup purposes be another. E.g., you may have Invoice # CA-ON-123, which has a primary key that's some numeric or GUID value.
About meaningful ids, i have experienced a good few of them. All of them were causing trouble. A good example is the ECLI it is an identifier that uniquely identifies court cases in europe. it is used in deeplinks so it can be placed in documents that refer to an other case etc. it looks like this: ECLI:FR:CC:2014:2014.432.QPC you can see FR for France and the CC is the court etc nl.wikipedia.org/wiki/European_Case_Law_Identifier So now why is this bad? The courts are organised in regions and sometimes the regions merge or split. Even countries can be renamed and split and merged. Then the ecli is "wrong" half of the ecli's should go to court A and the other half to court B. but no you can not change the ecli because it is deeplinked. So then here comes the ecli redirect component that keeps track of all renames of ecli's and that can be multiple hops to finally arrive at the original document. Nope just a unique string of maybe 10 characters and numbers 36^10 possible strings would not give this problem.
@@CodeOpinion This is very important in my opinion. Too many times I see primary keys exposed over API-s to public. If these are auto-increment, it will always reveal something about the business, either how many customers, invoices or orders etc there are in the system. This is usually something you never want to expose. Speaking of understandable ID-s, Stripe for example has one of the best out there I have seen.
I pretty much disagree with all of this. A record's ID should ALWAYS be of a consistent affinity across tables. Asynchronously, an autoincrement ID can collide, which is why you should use UUID, GUID or CUID. If you need something human-readable, it should go in a different column, such as "code" (not "order_code", since it exists within the context of an order and is hence redundant). There is also some advice regarding "states" or "actions" being added to a response object. A pure REST API should only return the state of the resource without additional fields. The consuming application should be aware of the possible states, and which actions can be performed under which state, so you can add PUT /orders/:id { "state": "cancelled" } and the API would return the updated resource, or a 415 error if that state can no longer be legally applied to the resource. REST should be a minimal representation of the state of a resource; there must be some awareness of certain rules/constraints on the client's end, whether it's via documentation or something like OpenAPI.
This is great advice, full stop! Another reason this guidance is important is security: never send sequential IDs to the client. We all know that one logged-in user should never be able to see another logged-in user's data. Despite all good intentions, however, recent security breaches demonstrate that discoverability of IDs leads to information leaks to the tune of millions of users. Security is a topic that deserves its own dedicated attention; let's explore this more.
An extremely impressive video. I hope you would do another like this, where you take an API made by a noob like me and make it better. A few implementations and examples of the points you just mentioned would be very helpful.
🎯 Key Takeaways for quick navigation:
00:41 🆔 *Consider where you generate identifiers (IDs) in a distributed environment to avoid collisions and enable asynchronous processing.*
02:32 🌐 *Generate meaningful identifiers that are human understandable, providing valuable information at a glance.*
04:10 📄 *Include information in API responses about possible actions based on the current state of the system to guide clients on what they can do.*
06:43 🔄 *Embrace evolvability by focusing on actions in responses, reducing client logic changes when business rules evolve.*
08:49 🚫 *Avoid handcuffing yourself by designing responses with flexibility, allowing for additions without breaking backward compatibility.*
09:04 📋 *Use the language of your domain in API design, capturing behaviors and capabilities with terms that make sense within the context.*
Made with HARPA AI
Great video. These are weird things that only come up in certain scenarios, but building with them in mind makes your life a lot easier in the future if you were to hit those scenarios. Glad this showed up on my feed. Not a video I would have looked for, but one I am really glad I watched.
You're actually describing REST in tip #3 (well, the HATEOAS part of it). It's sad how many people don't know the details of the REST architecture, because it is so well thought out.
Could you please tell us about rate limiting, security, the size of an HTTP request, ... and how to have an API with high performance that works under high load?
@@CodeOpinion size of responses is a (big?) issue with mobile apps and APIs that serve thousands of lines (some root aggregate) are a problem for mobile devs ): 'tolerant reader' means slow app - maybe there is some space for tips and tricks in this issue (:
Interesting 3rd point. In the example discussed, the "CancelOrder" action will not be invoked by the caller immediately after the Order object is returned because in the real world customers don't cancel right after they've placed the order. The order will be cancelled sometime in the future and by that time the "state is pending and the order was placed in the last 15 minutes" condition may no longer be true. So does the server returning the cancel action really matter aside from making the route opaque to the client?
You can add a timestamp field to the CancelOrder request. Edit: Never mind, that would create a security issue. But you could still create a timestamp internally in the logic of the API code.
@@deviousengineer8398 That would require clients to now be aware of the fact that there can be a timestamp property in the payload, and then update their implementation to take that into account before sending cancellation requests. What if the API evolves to require more business constraints before allowing cancellations in the future? Wouldn't the client-side implementations require an update every time a new business rule is added? So how exactly is evolvability of the API improved by applying this pattern?
I really like tips 3 & 4. Then using tip 4 to include the available statuses was really cool too. 99% of our APIs are consumed by the same developer who writes them, and via a generated client, but I still think this adds value.
Something I would recommend seeing this example is to not call things order->orderID and customer->customerID - it's redundant. It should always (in my opinion) be customer->id and order->id. If you want to have identifying parameters on an object, the advice with a meaningful ID (such as CA-ON-xxx) is good. Stripe also does this (cus_xxyy, pi_xxyy etc.). As an alternative, have a type property with a string that identifies the object's type.
We have a rule that if it's a database concern it shouldn't be exposed in the Api. This helped a lot for people to understand why int ids and even guids were not a good idea and helped in creating real business keys for data that migrate across boundaries actually made sense. I think more content creators should help enlighten the benefits of Hypermedia as the engine of application state (HATEOAS) in proper REST api design. Thanks for sharing.
@@ngugimuchangi5824 Apologies, I don't understand how the two affect each other. Endpoint actions represent a feature's behaviour, and you have as many of those as the feature requires. You don't need an add method to scale to 100... you might need 100 "services" behind the endpoint to process it, and that should be able to scale. Can you please clarify your question?
While I used auto-increment id, UUID, etc, I remember considering purpose specific identifiers. I really liked the idea of providing possible actions in the REST API response, nice suggestion, thank you
Lately I've become a huge fan of how Stripe generates ids... Leads with a resource identifier like pi_ for a payment intent, and a nanoid that appears to have the date as part of the hash (so sorting by the id is also sorting by the create date).
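A minimal sketch of the "type prefix plus opaque part" shape described above. This is not Stripe's actual algorithm: the prefix table, the `new_id` helper and the 16 random bytes are assumptions for illustration, and the opaque part here is purely random (it does not embed a date).

```python
import secrets

# Hypothetical prefix table; "cus" and "pi" mirror the examples in the thread,
# "ord" is made up.
PREFIXES = {"customer": "cus", "payment_intent": "pi", "order": "ord"}


def new_id(resource: str, nbytes: int = 16) -> str:
    """Return '<prefix>_<random token>' using a cryptographically secure source."""
    return f"{PREFIXES[resource]}_{secrets.token_urlsafe(nbytes)}"


print(new_id("customer"))
print(new_id("payment_intent"))
```

The prefix tells a human (or a support engineer reading logs) what kind of object an ID refers to, while the random part keeps IDs unguessable and reveals nothing about volumes.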
This video, especially the actions part on the API, actually helped me a lot. I was searching for some way to separate business logic from presentation logic, thanks man.
What a great video, thanks for the tips, I thought they were really helpful. Also would love to hear more of your thoughts and tips for distributed systems.
In the example of order canceling, from a UI standpoint, you might want to display a greyed-out cancel button and a reason why an order can't be canceled. If the state was 'shipped' for example, how would you add this? An idea I came up with is the following:
```json
{
  "orders": [
    {
      "orderID": "CA-ON-54812",
      "status": "shipped",
      "actions": [
        {
          "name": "CancelOrder",
          "enabled": false,
          "reasons": ["status:shipped"]
        }
      ]
    }
  ]
}
```
The front end could then use this reason to find a translation key for the reason it is unavailable. I made it an array in case there might be multiple reasons. Do you think this is a good idea and why/why not? How did you solve this in the past?
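For what it's worth, the server side of that idea could look roughly like this. It is a sketch under assumptions: the `minutesSinceCreated` field and the `window:expired` reason code are hypothetical, and the 15-minute pending rule is borrowed from the example in the video.

```python
import json


def cancel_action(order: dict) -> dict:
    """Build the CancelOrder action, collecting every reason it is unavailable."""
    reasons = []
    if order["status"] != "pending":
        reasons.append(f"status:{order['status']}")
    if order["minutesSinceCreated"] > 15:
        reasons.append("window:expired")  # hypothetical reason code
    return {"name": "CancelOrder", "enabled": not reasons, "reasons": reasons}


order = {"orderID": "CA-ON-54812", "status": "shipped", "minutesSinceCreated": 5}
response = {"orders": [{"orderID": order["orderID"],
                        "status": order["status"],
                        "actions": [cancel_action(order)]}]}
print(json.dumps(response, indent=2))
```

Collecting reasons into a list keeps the rule evaluation in one place and lets the front end map each code to a translated message.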
the array of actions caught my attention, it is very similar to HATEOAS approach that provide links to clients be able to navigate depending on the state of the server.
This is a great video. Thanks for the insight! What is an in-depth resource for point three? I have some questions about it as far as implementing best practices, such as: - What is the expectation for clients persisting these actions? I've seen a comment about setting cache expiration, but what about for actions without one? Do clients typically store these actions and query it on their side when coming back in a future session? - Correct me if I'm wrong, but it seems like one of the benefits of point three is to essentially self document endpoints. However, how detailed should documentation be external to the response? For example, obviously, a POST endpoint seems like it should be documented as it's basically the first entry point into a service. However, tying into my first question, is it reasonable to expect a client to not need a GET endpoint documented if they're storing the actions on their side? That feels a bit too imposing upon them to me. However, documenting a GET endpoint for an entity eliminates the benefit of obscuring that URI. I'm sure I'll have other questions as I look into this, so a great resource would be much appreciated. Bonus points if it explains it more casually. Technical documentation is tough for me to read, though paired with a more casual resource would be the best. Thanks in advance!
I usually agree with all you say but I would never recommend having meaningful concatenated keys as id, it invites to code that uses substring and split to do actual logic on. A big no no. it also leads to y2k like problems where at some point in time an extra character is needed to contain the future info. also when an order is being redirected to for instance an other area where they have it in stock, you will see stupid "citizen developer" systems like BI reports or power apps that use the substring of the id to calculate/group the profit of an area. if you need meaningfull info on your order just add the property itself in the message "Area" : "Ontario", that is a property that can be updated when needed, you can never update the id.
There's a difference between your storage ID and an ID that you use for query and display purposes. They can be the same, but they don't have to be. I agree you wouldn't want to be parsing a string to pull out information for programmatic reasons. As I said in the video, it's for "at a glance" use, so end users know the context. A great example of this is often an Invoice or PO #. You may persist that in your DB as a GUID/UUID/auto-increment, whatever, that is also used as a foreign key, but you can still have an ID that is more user-understandable, which is often editable (e.g., PO#). Moral of the story: they aren't mutually exclusive. Wish I would have said this in the video, so here's the comment.
Great video. How about the search time on the database if we use uuids as identifiers? Are Databases happier with integers identifiers or they are just fine with uuids too?
Depends on your database ultimately. Some are, some aren't. For example, a database with a clustered index is ordered, but for some DBs that doesn't need to be the primary key.
I once had to program something up against an API where practically everything used a different name in the API compared to both the domain and what those things were named on the site I was interacting with. That was not fun, but what was even less fun was how the API usage would break every other time I was not looking at it for some time, because they would have updated something, which would then cause the old usage of the API to break. This means that I would regularly have to go and manually walk through their API with their arcane naming, just to check whether things had gotten broken.
Did they provide it as an open API for you to use? In that case I can agree with you. Otherwise you can't really complain if you're using an API meant for their specific needs and their clients.
They only had 2 clients, and we were one of those 2. We were directly paying them to develop these things for us. At some point it came up that some access through an API was critical to the use of the program, but that should be fine, since the way they built it they already had an API that their frontend interacted with. It just happened that over time practically everything had been renamed to something different in the frontend than what the backend called it. That and whatever they were doing with the other client would leak bugs in fairly unpredictable ways.
I wish you expanded more on why you believe IDs should be generated high in the stack for the API to be extendable in the future. I understand the advantages you mentioned but they don't seem related to our ability to expand the API in the future in a backward compatible way. In any case, great video! Well done
Say you can only cancel an order if it is in the pending status and was created less than 15 minutes ago, and you put the action into the JSON at the server end. Now the client gets that JSON and goes to the bathroom for 20 minutes. Now the action is invalid, but since the client hasn't refreshed their call, they can just go ahead and cancel the order, even though it may be too late or the status has changed.
When that user clicks that button though they now get 400 or something because on the server side there is logic to not allow cancel if older than 15 mins, and the UI would likely have error checking to handle such an outcome though, so I don’t think this is much of a problem?
@@CodeOpinion Actually I didn't want to ask you for a video but I wanted to ask you if you use the APIs with the custom GPT or rather the APIs in general
The thing with the response actions - they're nice, but most clients will ignore them as they likely cannot ingest them. The exception is when you have a configurable, rapidly changing API (though I suggest that is a small subset of cases).
Indeed. There's also a distinction between an API that you consume yourself, versus integration API:s consumed by external systems. I find that it is extremely rare that integrators can or want to use actions/hypermedia.
A tip for IDs: Use a timestamp, concat it with the user id and then return it encoded in base64. Guaranteed unique, cheap to produce, good with utf-8, can be safely done in multiple locations and doesn't give away internal info about volumes. Downside is that it's obviously a bit longer, maybe 40 chars, and isn't human friendly. But who cares?
Like seriously? Why do you think this is better than generating a UUID? There is a risk you will generate non-unique identifiers if the timestamp is the same (probably not a problem when using milliseconds, but I can imagine some "experts" that will use a timestamp down to seconds or even a local date (not unique)). Just why? You are creating new problems. Stop reinventing the wheel. Need unique identifiers? Use UUID.
@@GondyNM What you've done is half read what I wrote and excitedly rushed to the keyboard. I clearly said to concat it with the user id so dupes are impossible. I also listed the advantages: 1. No race condition workarounds 2. Guaranteed unique 3. Parallelisable 4. Doesn't share usage / volume information 5. Performant I've used this in production with zero issues. My main motivation was point 4, I did it to protect confidential internal data that uuid inadvertently shares. As to timestamps being to the second, that was very silly. Stop and think before you shoot from the hip next time.
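For readers following this exchange, here is a small sketch of the scheme as described (timestamp concatenated with user id, then base64). The `:` separator, the millisecond resolution and the function names are assumptions; the original comment does not specify them.

```python
import base64
import time


def make_id(user_id: str, now_ms: int) -> str:
    """Encode '<timestamp in ms>:<user id>' as URL-safe base64."""
    raw = f"{now_ms}:{user_id}".encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_id(identifier: str) -> tuple[int, str]:
    """Reverse the encoding; base64 is an encoding, not encryption."""
    ts, user_id = base64.urlsafe_b64decode(identifier).decode("utf-8").split(":", 1)
    return int(ts), user_id


new = make_id("u42", time.time_ns() // 1_000_000)
print(new, decode_id(new))
```

Two things follow directly from the construction: the same user producing two IDs within the same timestamp tick gets the same value, so uniqueness rests on the timestamp resolution, and anyone holding an ID can decode the timestamp and user id from it.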
Thoughts on designing an API with ephemeral instances vs an API with 'always on workers'? Ephemeral is easier and more scalable, always on has lower latency but is more complex (because the code inside must be async / concurrent).
I would always refer to OpenAPI specifications.. big corp like Microsoft or Google, as well as well build web-based foss projects (unless you want to play around with gRPC or GraphQL)
I don’t think there is a problem with auto-increment IDs when building event-driven systems. In a transactional environment, like building financial solutions, you always have unique identifiers such as a reference, which you already have a strategy for generating. IDs for me are just for record arrangement in the DBs, so I am less concerned about that; it has nothing to do with the strategy you want to use to store your data.
The frontend doesn’t process nor construct the URI for the action, it blindly passes it through from the API. Opaque is in relation to the functional structure (eg base, object identifier, action name) that the frontend will then not be aware of as it just sees it as one long string.
I do have some issues with the idea of adding statuses to the list of orders. It feels like it breaks the purpose of the endpoint and inflates the payload. I see the benefit of it for rendering the filter without the need to query a different endpoint (GET /order/status), but that data should be queried only once and then cached, rather than fetched every single time. It's not so dynamic that we would need it updated on every request. I would also add that not exposing the DB id is a security question too (and it hides your order volume from your competitors). It's just one extra piece of info about your DB which should not be public (so yeah, use a generated ID). Anyway, this is good advice, but I would advise against clumping various lists of properties into one call, especially if they are static.
Thanks as always for the great content Derek! Regarding meaningful identifiers, this is an interesting tip. My only concern is to be sure anyone in the organization treats it as identifiers and no more, in that case works like a charm. If, for some reason, in some services, someone adds the logic that splits the ID to pickup "CA" and "ON", evolving the ID and making it have a different structure will cause an unpredictable problem in those services. Have you ever experienced something like this?
Don't like the actions part, given your examples with the "timed" cancel being available etc. And it's a huge security risk to generate URLs for actions, as you are actually injecting paths and methods for your API clients to call. All sorts of validation need to be done before trusting those values received (DNS/hostname etc.), or you risk leaking your API tokens too.
From another point of view, how coupled/inefficient would be backend to generate that list of actions available (if domain is complex, that could be 10x more SQL queries or etc)
I think it’s clever and yes under the right conditions it could be helpful. Sending actions to the client does not mean compromise security. Plus you can send a list of available actions which the client can “try” to call but the BE would still verify and authorize.
the message format returned in the cancel order example really reminds me of hal+json. It's a nice iteration from that cuz HAL gets a little crazy honestly
Actions as a part of entities is a good one, I'll probably take it into account! But as for states of some props, I'd rather just go for typeset that I could use on front-end in form of Typescript.
Is there a name to this pattern where you return the "actions" list with what you can do with the order? I really like this idea and want to read more about it
I don't know if these tips are in the context of REST API (I would think so, because of the examples provided). But using verbs instead of nouns for your endpoints is not very REST. Just something I catch in the video
Ever thought of getting a lavalier mic Derek. I feel like you're trying to raise your voice unnaturally to get good audio levels when you're in a wide shot without your mic in frame.
Although passing actions in the response makes the system very extensible, is there a concern of making the API more fragile, especially for a large scale distributed system? If you’re updating business logic, is it better to make this update explicit with a breaking semver change, and blue/green transition the clients?
UUIDs also cause trouble for databases, as the index used to find records becomes really inefficient, unless you use UUIDv1 or ULID, but then you expose a timestamp of when a record was created. You could make POST a bit more secure against double adding caused by timeouts if the client generates a UUID and the server returns 409 if they try to use the same UUID again. Auto-increment integers are possible in distributed systems if you have them increment not by 1, but by the (maximum) number of database servers.
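Both suggestions in this comment are easy to sketch. The code below is illustrative only: `post_order` uses an in-memory dict as the store, and `id_sequence` assumes each server also starts at its own 1-based offset, which the comment implies but does not state.

```python
import uuid

orders = {}  # in-memory stand-in for the orders table


def post_order(client_id: str, payload: dict) -> int:
    """Create the order once; a retry with the same client-generated ID gets 409."""
    if client_id in orders:
        return 409
    orders[client_id] = payload
    return 201


def id_sequence(server_index: int, server_count: int, n: int) -> list[int]:
    """First n IDs for one server: start at its 1-based index, step by server count."""
    return [server_index + k * server_count for k in range(n)]


key = str(uuid.uuid4())
print(post_order(key, {"item": "book"}))  # 201
print(post_order(key, {"item": "book"}))  # 409
print(id_sequence(1, 3, 3))               # [1, 4, 7]
print(id_sequence(2, 3, 3))               # [2, 5, 8]
```

Because each server only ever emits IDs congruent to its own index modulo the server count, the sequences never overlap, as long as the server count is fixed at or above the real number of servers.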
What's your take on string vs numbers? I'm leaning towards using numbers only for things that are explicitly intended to be done math with. Often there are e.g. IDs that happen to be numbers but which are for all intents and purposes just character sequences with no mathematical meaning other than perhaps being sortable in the order in which they were created. I think for that one can make a good case to use strings then also for data that currently happens to be numbers, because that autoincrementing integer might at some point in the future become a UUID or some other human-interpretable id.
I'm not him, only a noob dev here, but I think it should be strictly dependent on whether that column will actually be used mathematically. IDs most likely aren't, so I personally would rather have UUIDs than auto-increment ints; phone numbers are also not used mathematically for example, so just keep them as strings, especially if you think about things like leading zeros, different formats/symbols/etc which might be present in an otherwise generally-thought-of-as-number field
@@pavelyeremenko4640 True, but AFAIK one that only requires sortability. Indexing on string values is common. I'm thinking more of mathematical operations like addition and multiplication. Addition on numbers that represent e.g. sums of money yields meaningful results. Adding one technical row/document id to another does not.
Putting meaning into a unique ID adds complexity to the system. Putting the responsibility for this UID before the persistence also adds a LOT of complexity, and the payoff in speed is probably mostly negligible. One should REALLY do a lot of PoCs and non-functional testing before jumping on that train, and ONLY if it's reeeeeally needed. Meaning as a prefix is slow and has a cost; meaning as part of the UID generation calls for a service to create it based on some business logic, so it needs rules, etc. Trusting some part of the system to create the UID and pass it down means creating a come-back mechanism for when this causes an error, plus reading messages and reconstructing events...
Another very cool video. Generating the ID at the controller/API level is an interesting solution, but I'm going to have to agree with one of the points you made later on: context is everything. Out of curiosity, what do you think of Mike Amundsen's "application/vnd.collection+json" media type for an API? Seems like he's developed semantics for a *very* general API whereas you're proposing custom (but consistent semantics) for any given API. I've seen some harsh criticisms of the collection+json (and even agree with them, to a degree). Do you have any criticisms? That would also make a cool video. Thanks for making these :)
HATEOAS over JSON isn't going to happen. You still need developer documentation, so listing the URI and method isn't really useful, and it is actually harder to work with as it isn't a static string; including an OpenAPI operation name as a "rel" tag would be more helpful. It also doesn't convey WHY an action is or isn't available. If I had one complaint about current "REST" tooling, it's this: if you want to return an error on cancelling an order such that a client can interpret one of n errors (e.g. order is shipped, or past the 15-minute time limit), current tooling sucks on both the client and the server side.
Of course. The point is you're combining runtime and design time. Not strictly one or the other. You can convey why in your responses. Include the action with no uri but rather a disabled property... Whatever floats your boat. You still need docs on payloads. I do make the operation ID and the action name or link relation the same for reference. Add info at runtime and be explicit to the client rather than all knowledge being decided up front at design time of the client.
Oh don't get me wrong, I'm 100% for returning that information in the response for runtime use. I'm probably more venting that there's somehow zero support in the ecosystem for anything like this.
I don't understand why in the "order" object you have a property "orderId"; it is needlessly redundant. This is much better: { "order": { "id": "...", "total": 198.00, "customer": { "id": "...", "name": "..." } } }
Derek, I could not disagree more strongly with tip #2. I have worked many years to get our company's developers to stop this practice. Here is why: while it might be useful for you to understand the ID, inevitably some developer will parse that ID and use the meaningfulness in their code (even though that same info may be available in the payload). Once this happens, it's downhill fast from there. If you change the structure, their code breaks. If you segment like your example (CA-ON-12345), then when you get past item 99999, your code breaks. It is just a bad practice and I am really surprised that you offered this as a positive. I love your content and send people to your channel often, but I couldn't let this one slip by without commenting.
Thanks for the comment. Having identifiers that are meaningful for REPRESENTATION to the user and developers is helpful. That doesn't mean it has to be what you're using as the primary key in a database. I wish I would have said this in the video to make this clear, as I've got a few comments about it. A great illustration of this is something like an invoice. Your invoice number can often be prefixed or given some meaningful structure, and your underlying database stores it but doesn't use it as a primary or foreign key; everywhere for display purposes you're using that invoice number, and even for searching or reference in other places. As to "developers will parse it"... well, trying to parse any string that could change is going to be a disaster if it's careless (whole different topic). I don't often use the term "identifier" or "ID" to represent a primary key exclusively, which I think is probably not normal, and that's where the disconnect is. I often have multiple "identifiers" for things, and often they are each unique.
@@CodeOpinion Thanks for the reply. I typically guide people to distinguish between a “business identifier” and a “system identifier”. I will often recommend using the business identifier in URIs (assuming it isn’t PII) for the same reasons you give. Typically, they are well known and printed on some business form or shown on a screen. They can be useful for “priming” the pump when making that first call. I also advocate that each entity within the API have a system identifier that is system-generated and meaningless. Most times, an aggregate will have a business identifier but its subordinate entities will not, so having a system identifier always available is helpful when building other URIs. I have even had people support both the system and business identifier for an aggregate just to provide flexibility. As for bad developer practices, I know you understand them well given your channel! I have found that my guidance often has to steer around common bad practices since there are always new developers coming in who haven’t learned the hard way.
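To make the "business identifier" vs "system identifier" split concrete, here is a rough sketch. The `Invoice` and `InvoiceRepository` names, and the in-memory dictionaries standing in for a database, are assumptions made up for this example.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Invoice:
    invoice_number: str  # business identifier: shown to users, may be reformatted
    total: float
    # System identifier: opaque, generated once, never changes.
    system_id: str = field(default_factory=lambda: str(uuid.uuid4()))


class InvoiceRepository:
    """Hypothetical store keyed by system_id, with a lookup index on the number."""

    def __init__(self):
        self._by_system_id = {}
        self._number_index = {}

    def add(self, invoice):
        self._by_system_id[invoice.system_id] = invoice
        self._number_index[invoice.invoice_number] = invoice.system_id

    def find_by_number(self, invoice_number):
        return self._by_system_id[self._number_index[invoice_number]]

    def renumber(self, system_id, new_number):
        """Change the display number without touching the key other records use."""
        invoice = self._by_system_id[system_id]
        del self._number_index[invoice.invoice_number]
        invoice.invoice_number = new_number
        self._number_index[new_number] = system_id


repo = InvoiceRepository()
invoice = Invoice(invoice_number="CA-ON-123", total=198.00)
repo.add(invoice)
key_before = invoice.system_id
repo.renumber(invoice.system_id, "CA-QC-123")
```

Anything that referenced the invoice by `system_id` keeps working after the renumbering, which is the flexibility both commenters are describing.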
it shouldn't be difficult to prevent programmers from using knowledge about the structure, like the 'CA-ON...' : during testing, distribute some unknown, random identifiers. Code expecting and depending on 'CA-ON' will barf
Tip 1 was rather useless, because it involved a complete architectural change, from a synchronous monolith to async event-driven. I'm also not a big fan of tip 4; I think it is a mistake. Add a separate endpoint to return information about the API itself: /orders/ -> list of orders, /orders/ -> one order, /orders/describe -> info about the API, like the available statuses.
Auto incrementing IDs are less hackable? As I've mentioned in other comments, this doesn't need to represent a database key. It can be composed purely at runtime for display.
I'd like to add: don't conflate business logic with HTTP status codes. When I see a 404 Not Found, it should mean I screwed up royally and sent the wrong URL entirely. It should not mean "teehee, order not found".
I like the idea of keeping the URI list on the backend side. As I see it, we can keep all the URIs the client needs on the server and preload them once the app has started. Also, a crazy idea: you could send all the logic to the client side from the server xD.
Passing actions in the API response is not a good way to do it, in my opinion. It adds a lot of complexity to the project. It also takes time, and testing is much harder and more complex. For example, say you can cancel the order within 15 minutes. The user knows the cancel URL. If a request is made to that endpoint, you have to check and validate the 15 minutes again anyway. And so it continues. I don't like the approach.
@@CodeOpinion I don't like DDD, or actually I don't like object-centric designs. I guess the concept is inspired by clean architecture, or other concepts that are similar in a way. For me it's just an integration point, well designed, that is robust, extendable, and easy to interpret.
@@CodeOpinion FWIW, your videos have helped me solidify my understanding and appreciation for DDD. That includes understanding when it's not needed too.
very unique tips for api design, this is the first time I have heard of passing the available states in api response. thanks a lot!
Look into HATEOAS
You can read about HATEOAS. I think this is just a part of it.
Edit. Sorry you were maybe talking about something later on the video not about action links?
No I think they were referring to that. But yes, hypermedia.
For tip 1, if you are moving the ID up to the API and queuing the persistence, then how errors are handled becomes important and adds complexity. You might be trading the speed of create request time for having clients poll for errors or confirmations the order is created successfully. Does not invalidate the idea, particularly for systems that handle large volumes of requests, but it is a bit of an advanced topic/implementation that needs the developer to apply some thought.
Can you give an example for "You might be trading the speed of create request time for having clients poll for errors or confirmations the order is created successfully." I don't understand how generating the ID sooner influences whether you need to poll for errors or confirmations. Generating the ID at the API level makes it possible to implement such a solution where you have to poll for errors or confirmations, but it doesn't mean you have to.
@@joeke1234 My point is really about synchronous vs asynchronous processing. If the database creates the ID, it is as a result of the insert being successful. If you generate it earlier and hand it off for asynchronous processing, then you don't know if it was successful.
@@noideaprojects The same question was passing through my mind.
If I make a meaningful identifier, I also have the responsibility of creating it, and that's hard work when you have a large volume of requests and have to coordinate multiple async functions.
@@noideaprojects Then you also need to offload data validation from DB to API, which goes against db best practices I guess, not sure. But I am sure that databases already have solutions for the kind of problems that emerge in distributed contexts, eg. unique identifiers across multiple nodes etc...
I mean, I guess you're not talking about UUIDv4:
Version 4 (random)
Thus, for variant 1 (that is, most UUIDs) a random version-4 UUID will have 6 predetermined variant and version bits, leaving 122 bits for the randomly generated part, for a total of 2^122, or 5.3×10^36 (5.3 undecillion) possible version-4 variant-1 UUIDs.
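The arithmetic in that quote can be checked in a couple of lines. This is just a sanity check using Python's standard `uuid` module; the variable names are arbitrary.

```python
import uuid

FIXED_BITS = 6                 # 4 version bits + 2 variant bits are predetermined
RANDOM_BITS = 128 - FIXED_BITS # 122 bits left for randomness
possible = 2 ** RANDOM_BITS

print(RANDOM_BITS)             # 122
print(f"{possible:.1e}")       # 5.3e+36

sample = uuid.uuid4()
print(sample.version)          # 4
```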
an important thing to consider if you add actions, as shown here, is caching - if, for example, the "cancel order" action is only valid for the next 15 minutes, cache expiration should be set accordingly. this can get a little tricky if you have multiple actions with different expiration. part of me honestly likes to avoid this type of problem - in some cases, you might prefer to design endpoints with less data, requiring the client to make multiple, individually-cacheable requests, each endpoint having only one reason to expire and need a refresh. just something to consider. :-)
Do the caching at the application layer with Redis or something instead of at the network layer. Then you can invalidate the cache by firing an event on state change or queue an event for 15 minutes into the future in the case of "this order can only be cancelled if it's been made within 15 minutes"
@@spicynoodle7419 Sure, if you have control of all the clients - so it wouldn't work for something like a public API. I'm a web developer, so I tend to think of solutions that are interoperable at the network layer, such as proxies and cache servers etc. :-)
Because of the different expirations on actions, can't you add a validUntil field on the Action object itself? Then each action knows until when it's valid. The clients should then decide for themselves how to handle it, but that's none of the API's concern, imho.
I think having actions in responses does not free you from validation on server. Even if browser did return you data from cache and client did see "Cancel order" button, you should not be able to do it. All you need to do, is to gracefully handle 403 (or 409) situation, show user "sorry" message and refresh app state. This case is not so much different from a case when user loaded a page, left for 15 mins to grab a coffee and then clicks the button.
But I myself don't do "actions": too much hassle on the backend side, and I don't want to spend the server's CPU on this kind of work. It's easier for me to implement this logic on the client side. But maybe I will give it a try some day :)
@@Baby4Ghost again, yes, assuming you have control of all the clients. if you're building a public API, that's a problem - if you're building a private backend API for a client app you control, go for it. :-)
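A possible shape for the validUntil idea from this thread, combined with the cache-expiration concern raised at the top of it. The field names (`validUntil`, `placed_at`) and both function names are assumptions for this sketch; the 15-minute rule is the one from the video's example.

```python
from datetime import datetime, timedelta, timezone

CANCEL_WINDOW = timedelta(minutes=15)  # business rule from the video's example


def build_actions(order, now):
    """Attach a validUntil timestamp to each time-limited action."""
    actions = []
    deadline = order["placed_at"] + CANCEL_WINDOW
    if order["status"] == "pending" and now < deadline:
        actions.append({"name": "CancelOrder", "validUntil": deadline.isoformat()})
    return actions


def cache_max_age(actions, now):
    """Seconds until the earliest action expires, or None if no action has a deadline."""
    deadlines = [
        datetime.fromisoformat(action["validUntil"])
        for action in actions
        if "validUntil" in action
    ]
    if not deadlines:
        return None
    return max(0, int((min(deadlines) - now).total_seconds()))


placed = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
order = {"status": "pending", "placed_at": placed}
five_minutes_in = placed + timedelta(minutes=5)
actions = build_actions(order, five_minutes_in)
max_age = cache_max_age(actions, five_minutes_in)  # 600 seconds left in the window
```

Taking the minimum across all deadlines addresses the "multiple actions with different expiration" case: the response as a whole is only cacheable until its first action goes stale.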
Really interesting ideas! If there was one thing about API design I'd really love people to stick to, it would also be just sticking to well known REST principles. May it be the design of the URL, or the response. I've seen so many crude apis which e.g. list getting an account as "api/account/getAccount" instead of just "api/account", and modifying entries of a catalogue as something like "api/catalogue/updateCatalogueWithItem" instead of something like PUT / POSTing to "api/catalogue//items". Responses are the same. We have well defined status codes and http statuscodes, but a lot of apis would rather return a 200 with a body containing custom enums... It's a mess out there. I wish this stuff was more standardized across the board.
It depends on the audience, and your architecture.
I agree that if you are going to be providing an API that is being consumed by external parties, using generally accepted REST “best practices” makes a lot of sense.
In my company however, the VAST majority of APIs that we create, are never consumed externally.
They are only consumed by the UI that customers use. And in these cases we don't even write backend controllers/endpoints, but rather use a "CQRS"-like pattern, where things are named like "CancelOrder, DispatchOrder, FlagOrderForFraud". Our controllers, and the API clients used by the UI, are all source generated, and as such, our endpoints end up being /Orders/CancelOrder, /Orders/DispatchOrder, /Orders/GetOrder?id=123.
This is totally fine
Agree on the status codes regardless
If you follow REST for everything then you have to do a PATCH /orders/1 with a body of { "command": "cancel" } which doesn't make much sense. I'd rather have a single controller handle a single command in a task-based UI. Here a POST /orders/1/cancel is what I would go for
@@spicynoodle7419 let's say the case is liking a blog post with the endpoint POST /posts/1/like. If I want to cancel/undo the like, is it better to use DELETE /posts/1/like or POST /posts/1/cancel-like?
@@fourthkim3715 I would make it /dislike because it's already an explicit action in the UI so it doesn't make sense to make it a generic CRUD call and infer the action in the controller.
Do you have /register, /verify-email, /login, /logout, /forgot-password, /new-password? Or do you have some /auth resource that you call with different verbs? Or is it something silly like POST /user for a register, DELETE /session/{userid} for logout?
What about changing the order of some drag & drop list. Do you update the list and do if (request has order list)?
@@fourthkim3715 I lean towards a POST for all actions you can do on an object. In case of likes you will probably not only have a table containing likes. Chances are you have a counter keeping track of the amount of likes. In that case you are not only doing a delete but you will also update another record holding the like count.
just don't roll your own "unique" randomly generated identifiers with less entropy than UUIDv4s trying to make them more meaningful.
also would be helpful to point out that including the actions in the response is called HATEOAS for those trying to read more about it :)
Thank you so much!!! Was struggling to find this :)
This is such great advice - everyone needs to watch this, you've covered everything wrong with the AWS public APIs at least.
One recommendation/alteration I would like to add, especially if this API is specifically for integration, is to use capabilities as your identifier scheme. That is, the opaque part should have more than 16 bytes of entropy and be cryptographically generated. This reduces the need for complex permissions checking in the server and allows integrations to subset access to your objects without having to proxy all requests. Object-capability security has been a long time coming on the web, but videos like this one give me hope.
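A rough sketch of what "more than 16 bytes of entropy, cryptographically generated" could look like, using Python's standard `secrets` module. The `new_capability_id` name and the `ord` prefix are made up for illustration, and a real object-capability design involves more than just ID generation.

```python
import secrets


def new_capability_id(prefix="ord"):
    """Return an unguessable identifier backed by 32 random bytes from the OS CSPRNG."""
    return f"{prefix}_{secrets.token_urlsafe(32)}"


first_id = new_capability_id()
second_id = new_capability_id()
```

`secrets.token_urlsafe(32)` draws 32 random bytes and Base64-encodes them, so the opaque part is about 43 URL-safe characters.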
@3:00 The topic of "understandable IDs" resonated with me. I've always been keen on designing IDs that provide insight into the data's origin or domain just by their format. However, I've consistently been advised against it.
Many blog articles emphasize that IDs shouldn't carry business logic and doing so might even be considered poor practice, since their primary function is to index unique rows in a database.
They aren't mutually exclusive. You can still have a primary key and foreign key for a database be one thing, and an ID used for display and other lookup purposes be another. E.g., you may have Invoice # CA-ON-123, which has a primary key that's some numeric or GUID value.
About meaningful IDs: I have experienced a good few of them, and all of them caused trouble. A good example is the ECLI, an identifier that uniquely identifies court cases in Europe. It is used in deep links so it can be placed in documents that refer to another case, etc. It looks like this: ECLI:FR:CC:2014:2014.432.QPC. You can see FR for France, CC is the court, etc. nl.wikipedia.org/wiki/European_Case_Law_Identifier So why is this bad? The courts are organised in regions, and sometimes the regions merge or split. Even countries can be renamed, split, and merged. Then the ECLI is "wrong": half of the ECLIs should go to court A and the other half to court B. But no, you cannot change the ECLI, because it is deep linked. So here comes the ECLI redirect component that keeps track of all ECLI renames, and it can take multiple hops to finally arrive at the original document. A plain unique string of maybe 10 letters and digits (36^10 possible strings) would not have this problem.
@@CodeOpinion This is very important in my opinion. Too many times I see primary keys exposed over API-s to public. If these are auto-increment, it will always reveal something about the business, either how many customers, invoices or orders etc there are in the system. This is usually something you never want to expose.
Speaking of understandable ID-s, Stripe for example has one of the best out there I have seen.
Why?
@@CodeOpinion Isn't this in conflict with database normalization, specifically 2NF?
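I always look forward to your updates! Your insights into effective API design practices prompted me to refine my own approach. Since I started using EchoAPI, I have really noticed a big improvement in the structure of my APIs.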
I pretty much disagree with all of this. A record's ID should ALWAYS be of a consistent affinity across tables. Asynchronously, an autoincrement ID can collide, which is why you should use UUID, GUID or CUID. If you need something human-readable, it should go in a different column, such as "code" (not "order_code", since it exists within the context of an order and is hence redundant). There is also some advice about adding "states" or "actions" to a response object. A pure REST API should only return the state of the resource without additional fields. The consuming application should be aware of the possible states, and which actions can be performed under which state, so you can add PUT /orders/:id { "state": "cancelled" } and the API would return the updated resource, or a 415 error if that state can no longer be legally applied to the resource. REST should be a minimal representation of the state of a resource; there must be some awareness of certain rules/constraints on the client's end, whether it's via documentation or something like OpenAPI.
It’s like he is trying to put functionality into api that it wasn’t designed for.
Completelyyyyyyy agree
With an OpenAPI spec all these states, transitions, and so on would be defined-no need to clutter a response with info that’s in the API spec
Sounds like people forget about HATEOAS.
what if the ID was an auto-incrementing GUID though
This is great advice, full stop! Another reason this guidance is important is security: never send sequential IDs to the client. We all know that one logged-in user should never be able to see another logged-in user's data, yet despite good intentions, recent security breaches demonstrate that discoverability of IDs leads to information leaks to the tune of millions of users. Security is a topic that deserves its own dedicated attention, so let's explore this more.
An extremely impressive video. I hope you would do another like this, where you take an API made by a noob like me and make it better. A few implementations and examples of the points you just mentioned would be very helpful.
Great suggestion
🎯 Key Takeaways for quick navigation:
00:41 🆔 *Consider where you generate identifiers (IDs) in a distributed environment to avoid collisions and enable asynchronous processing.*
02:32 🌐 *Generate meaningful identifiers that are human understandable, providing valuable information at a glance.*
04:10 📄 *Include information in API responses about possible actions based on the current state of the system to guide clients on what they can do.*
06:43 🔄 *Embrace evolvability by focusing on actions in responses, reducing client logic changes when business rules evolve.*
08:49 🚫 *Avoid handcuffing yourself by designing responses with flexibility, allowing for additions without breaking backward compatibility.*
09:04 📋 *Use the language of your domain in API design, capturing behaviors and capabilities with terms that make sense within the context.*
Great video. These are weird things that only come up in certain scenarios, but building with them in mind makes your life a lot easier in the future if you were to hit those scenarios. Glad this showed up on my feed. Not a video I would have looked for, but one I am really glad I watched.
You're actually describing REST in tip #3 (well, the HATEOAS part of it). It's sad how many people don't know the details of the REST architecture, because it is so well thought out.
Yes, I'm describing hypermedia.
Could you please tell us about rate limiting, security, the size of an HTTP request, ... how to build an API with high performance that works under high load?
Split read and write with CQRS
Ya good idea, especially rate limiting.
@@CodeOpinion size of responses is a (big?) issue with mobile apps and APIs that serve thousands of lines (some root aggregate) are a problem for mobile devs ):
'tolerant reader' means slow app - maybe there is some space for tips and tricks in this issue (:
Interesting 3rd point.
In the example discussed, the "CancelOrder" action will not be invoked by the caller immediately after the Order object is returned because in the real world customers don't cancel right after they've placed the order. The order will be cancelled sometime in the future and by that time the "state is pending and the order was placed in the last 15 minutes" condition may no longer be true.
So does the server returning the cancel action really matter aside from making the route opaque to the client?
CancelableUntil: Date ;-)
You can add a timestamp field to the CancelOrder request.
Edit: Nevermind, that would create a security issue. But you could still create a timestamp internally in the logic of the api code.
@@deviousengineer8398 That would require clients to now be aware of the fact that there can be a timestamp property in the payload, and then update their implementation to take that into account before sending cancellation requests.
What if the API evolves to require more business constraints before allowing cancellations in the future? Wouldn't the client-side implementations require an update every time a new business rule is added?
So how exactly is evolvability of the API improved by applying this pattern?
I really like tips 3 & 4. Then using tip 4 to include the available statuses was really cool too.
99% of our APIs are consumed by the same developer who writes them, and via a generated client, but I still think this adds value.
and traffic, that you have to pay
dont send a 404 if the request for the content i am trying to get was successful but just empty, just send an empty array
Something I would recommend seeing this example is to not call things order->orderID and customer->customerID - it's redundant. It should always (in my opinion) be customer->id and order->id. If you want to have identifying parameters on an object, the advice with a meaningful ID (such as CA-ON-xxx) is good. Stripe also does this (cus_xxyy, pi_xxyy etc.). As an alternative, have a type property with a string that identifies the object's type.
We have a rule that if it's a database concern it shouldn't be exposed in the Api. This helped a lot for people to understand why int ids and even guids were not a good idea and helped in creating real business keys for data that migrate across boundaries actually made sense. I think more content creators should help enlighten the benefits of Hypermedia as the engine of application state (HATEOAS) in proper REST api design. Thanks for sharing.
On the matter of HATEOAS, does it scale well as you add more actions or endpoints?
@@ngugimuchangi5824 Apologies, I don't understand how the two affect each other. Endpoint actions represent a feature's behaviour, and you have as many of those as the feature requires. You don't need an add method to scale to 100... you might need 100 "services" behind the endpoint to process it, and that should be able to scale.
Can you please clarify your question?
Wouldn’t have thought of a lot of this stuff. Cool video!
While I used auto-increment id, UUID, etc, I remember considering purpose specific identifiers. I really liked the idea of providing possible actions in the REST API response, nice suggestion, thank you
Lately I've become a huge fan of how Stripe generates ids... Leads with a resource identifier like pi_ for a payment intent, and a nanoid that appears to have the date as part of the hash (so sorting by the id is also sorting by the create date).
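For anyone wanting to try the prefix-plus-sortable-suffix idea, here is one possible layout. This is NOT Stripe's actual algorithm (their format is not documented in this thread); `new_prefixed_id` and its layout of a hex millisecond timestamp followed by random hex are assumptions for illustration only.

```python
import secrets
import time


def new_prefixed_id(prefix, now_ms=None):
    """Build '<prefix>_<12 hex digits of ms timestamp><16 hex digits of randomness>'."""
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000
    # Fixed-width lowercase hex keeps lexicographic order equal to time order.
    return f"{prefix}_{now_ms:012x}{secrets.token_hex(8)}"


earlier = new_prefixed_id("pi", now_ms=1_000)
later = new_prefixed_id("pi", now_ms=2_000)
print(earlier < later)  # True
```

As pointed out elsewhere in the comments, any time-ordered ID exposes when the record was created, so this trades a little privacy for sortability and index friendliness.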
These are great inputs...would like more of this contents as these are just few.
As a Frontend Developer, I like that array of actions there
This video, especially the part about actions in the API, actually helped me a lot. I was searching for some way to separate business logic from presentation logic. Thanks, man.
What a great video, thanks for the tips, I thought they were really helpful. Also would love to hear more of your thoughts and tips for distributed systems.
In the example of order canceling, from a UI standpoint, you might want to display a greyed-out cancel button and a reason why an order can't be canceled. If the state was 'shipped' for example, how would you add this?
An idea I came up with is the following:
```json
{
  "orders": [
    {
      "orderID": "CA-ON-54812",
      "status": "shipped",
      "actions": [
        {
          "name": "CancelOrder",
          "enabled": false,
          "reasons": ["status:shipped"]
        }
      ]
    }
  ]
}
```
The front end could then use this reason to find a translation key for the reason it is unavailable, I made it an array in case there might be multiple reasons.
Do you think this is a good idea and why/why not?
How did you solve this in the past?
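One way the server and client sides of this proposal might fit together, sketched under assumptions: the `window:expired` reason key, the `TRANSLATIONS` table, and both function names are invented here, and only the `status:shipped` key and the action shape come from the JSON above.

```python
def cancel_action(order, minutes_since_placed):
    """Server side: always emit the action, with reasons when it is disabled."""
    reasons = []
    if order["status"] != "pending":
        reasons.append(f"status:{order['status']}")
    if minutes_since_placed > 15:
        reasons.append("window:expired")
    return {"name": "CancelOrder", "enabled": not reasons, "reasons": reasons}


# Client side: hypothetical translation table keyed by reason.
TRANSLATIONS = {
    "status:shipped": "This order has already shipped.",
    "window:expired": "Orders can only be cancelled within 15 minutes.",
}


def describe_disabled(action):
    """Map reason keys to display text, falling back to the raw key if unknown."""
    return [TRANSLATIONS.get(reason, reason) for reason in action["reasons"]]


action = cancel_action({"orderID": "CA-ON-54812", "status": "shipped"}, 5)
messages = describe_disabled(action)
```

Falling back to the raw key means a client that has not been updated still shows something when the server introduces a new reason.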
the array of actions caught my attention, it is very similar to HATEOAS approach that provide links to clients be able to navigate depending on the state of the server.
That's exactly what it is.
The meaningful ID CA-ON-#####: how do you prevent collisions across distributed systems? Wouldn't it be very long to have an id = "CA-ON-" + UUID?
This is a great video. Thanks for the insight!
What is an in-depth resource for point three? I have some questions about it as far as implementing best practices, such as:
- What is the expectation for clients persisting these actions? I've seen a comment about setting cache expiration, but what about for actions without one? Do clients typically store these actions and query it on their side when coming back in a future session?
- Correct me if I'm wrong, but it seems like one of the benefits of point three is to essentially self document endpoints. However, how detailed should documentation be external to the response? For example, obviously, a POST endpoint seems like it should be documented as it's basically the first entry point into a service. However, tying into my first question, is it reasonable to expect a client to not need a GET endpoint documented if they're storing the actions on their side? That feels a bit too imposing upon them to me. However, documenting a GET endpoint for an entity eliminates the benefit of obscuring that URI.
I'm sure I'll have other questions as I look into this, so a great resource would be much appreciated. Bonus points if it explains it more casually. Technical documentation is tough for me to read, though paired with a more casual resource would be the best. Thanks in advance!
I usually agree with all you say, but I would never recommend having meaningful concatenated keys as an ID. It invites code that uses substring and split to do actual logic on it. A big no-no. It also leads to Y2K-like problems where at some point in time an extra character is needed to hold future info. Also, when an order is redirected to, for instance, another area where they have it in stock, you will see stupid "citizen developer" systems like BI reports or Power Apps that use a substring of the ID to calculate/group the profit of an area. If you need meaningful info on your order, just add the property itself to the message, "Area": "Ontario". That is a property that can be updated when needed; you can never update the ID.
There's a difference between your storage ID and an ID that you use for query and display purposes. They can be the same, but they don't have to be. I agree you wouldn't want to be parsing a string to pull out information for programmatic reasons. As I said in the video, it's for end users to know context "at a glance". A great example of this is often an Invoice or PO #. You may persist that in your DB with a GUID/UUID/auto-increment key, whatever, that is also used as a foreign key, but you can still have an ID that is more user understood, which often is editable (e.g., PO #). Moral of the story: they aren't mutually exclusive. Wish I would have said this in the video, so here's the comment.
@@CodeOpinion This is a great comment. I wish this was highlighted more.
Great video. How about the search time on the database if we use uuids as identifiers? Are Databases happier with integers identifiers or they are just fine with uuids too?
Depends on your database ultimately. Some are, some aren't. For example, a database with a clustered index is ordered, but for some DBs that doesn't need to be the primary key.
Providing actions and Uri is a great idea, haven't seen that anywhere tho.
Great content! I love this! The whole thing makes sense!
I like tip 3. Nice idea. Thanks! Got a project I've just done part of where this would be useful, so I'm going to add it now.
I once had to program something up against an API where practically everything used a different name in the API compared to both the domain and what those things were named on the site I was interacting with. That was not fun, but what was even less fun was how the API usage would break every other time I was not looking at it for some time, because they would have updated something, which would then cause the old usage of the API to break. This means that I would regularly have to go and manually walk through their API with their arcane naming, just to check whether things had gotten broken.
Did they provide it as an open API for you to use? In that case i can agree with you. Otherwise you cant really complain if using an api meant for their specific needs and their clients.
They only had 2 clients, and we were one of those 2. We were directly paying them to develop these things for us. At some point it came up that some access through an API was critical to the use of the program, but that should be fine, since the way they built it they already had an API that their frontend interacted with. It just happened that over time practically everything had been renamed to something different in the frontend than what the backend called it. That and whatever they were doing with the other client would leak bugs in fairly unpredictable ways.
Oh, that sucks!
What is your stance on 'dont use guids as id's in sql server' performance wise? Are sortable guids ok?
Ya it's an issue as a clustered index and depending on your database, might not be really feasible to use (eg, mysql support is not good).
I wish you expanded more on why you believe IDs should be generated high in the stack for the API to be extendable in the future. I understand the advantages you mentioned but they don't seem related to our ability to expand the API in the future in a backward compatible way.
In any case, great video! Well done
Very unusual tips, forcing you to look on your API from absolutely different point of view. Thanks, that made me think
Great to hear!
Say you can only cancel an order if it is in the pending status and was created less than 15 minutes ago. You put the action into the JSON at the server end. Now the client gets that JSON and goes to the bathroom for 20 minutes. Now the action is invalid, but since the client hasn't refreshed their call, they can just go ahead and cancel the order, even though it may be too late or the status has changed.
When that user clicks that button, though, they now get a 400 or something, because on the server side there is logic to not allow a cancel if the order is older than 15 mins, and the UI would likely have error checking to handle such an outcome. So I don’t think this is much of a problem?
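The point about the server re-checking the rule can be sketched roughly like this. The `Order` shape, the `cancel_order` function, and the choice of 409 as the rejection status are assumptions for illustration (the comment above only says "400 or something"); the 15-minute window and "pending" status come from the example being discussed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Tuple

CANCEL_WINDOW = timedelta(minutes=15)


@dataclass
class Order:
    status: str
    created_at: datetime


def can_cancel(order: Order, now: datetime) -> bool:
    """The same rule decides whether the action is advertised and whether it is accepted."""
    return order.status == "pending" and now - order.created_at <= CANCEL_WINDOW


def cancel_order(order: Order, now: datetime) -> Tuple[int, dict]:
    """Return (HTTP status, body). The rule is re-checked on every request,
    no matter what an earlier response told the client."""
    if not can_cancel(order, now):
        return 409, {"error": "Order can no longer be cancelled."}
    order.status = "cancelled"
    return 200, {"status": order.status}


created = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
fresh = Order("pending", created)
stale = Order("pending", created)
fresh_result = cancel_order(fresh, created + timedelta(minutes=5))
stale_result = cancel_order(stale, created + timedelta(minutes=20))
```

The action in the response is a hint for the UI; the check inside `cancel_order` is what actually enforces the rule.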
Hello, could you help me understand what I could do if I use the APIs in a custom GPT?
Can you give me examples based on the APIs that I would like?
That's not really what I do on this channel but who knows maybe some type of ChatGPT API review in the future.
@@CodeOpinion Actually I didn't want to ask you for a video but I wanted to ask you if you use the APIs with the custom GPT or rather the APIs in general
On the first point, I think it’s better to generate a task Id for the entity if you are creating it asynchronously.
The thing with the response actions - they're nice, but most clients will ignore them as they likely cannot ingest them. The exception is when you have a configurable, rapidly changing API (though I suggest that is a small subset of cases).
Indeed. There's also a distinction between an API that you consume yourself and integration APIs consumed by external systems. I find that it is extremely rare that integrators can or want to use actions/hypermedia.
This actions thing is actually pretty common with payment provider APIs.
A tip for IDs: use a timestamp, concat it with the user ID, and then return it encoded in base64. Guaranteed unique, cheap to produce, good with UTF-8, can be safely done in multiple locations, and doesn't give away internal info about volumes.
Downside is that it's obviously a bit longer, maybe 40 chars, and isn't human friendly. But who cares?
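A minimal sketch of the scheme described above, with some assumptions: the `:` separator, nanosecond resolution, and the function names are mine, not the commenter's. Two caveats worth keeping in mind: base64 is an encoding, not encryption, so anyone holding the ID can decode the timestamp and user ID; and uniqueness only holds as long as the same user never gets two IDs for the same timestamp value.

```python
import base64
import time
from typing import Tuple


def make_id(user_id: str, timestamp_ns: int) -> str:
    """Concatenate a timestamp with the user ID and base64-encode the result."""
    raw = f"{timestamp_ns}:{user_id}".encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_id(encoded: str) -> Tuple[int, str]:
    """Reverse make_id, which shows the ID is readable by anyone who holds it."""
    raw = base64.urlsafe_b64decode(encoded).decode("utf-8")
    timestamp, _, user_id = raw.partition(":")
    return int(timestamp), user_id


# Typical use: take the timestamp from the clock at creation time.
new_id = make_id("user-42", time.time_ns())
```

Passing the timestamp in as a parameter keeps the function deterministic, which makes the collision case (same user, same timestamp) easy to see and test.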
Fantastic idea
Like seriously? Why do you think this is better than generating a UUID? There is a risk you will generate non-unique identifiers if the timestamp is the same (probably not a problem when using milliseconds, but I can imagine some "experts" who will use a timestamp down to seconds, or even a local date, which is not unique). Just why? You are creating new problems. Stop reinventing the wheel. Need unique identifiers? Use UUID.
@@GondyNM What you've done is half read what I wrote and excitedly rushed to the keyboard. I clearly said to concat it with the user id so dupes are impossible. I also listed the advantages:
1. No race condition workarounds
2. Guaranteed unique
3. Parallelisable
4. Doesn't share usage / volume information
5. Performant
I've used this in production with zero issues. My main motivation was point 4, I did it to protect confidential internal data that uuid inadvertently shares.
As to timestamps being to the second, that was very silly. Stop and think before you shoot from the hip next time.
Thoughts on designing an API with ephemeral instances vs an API with 'always on workers'? Ephemeral is easier and more scalable, always on has lower latency but is more complex (because the code inside must be async / concurrent).
I would always refer to OpenAPI specifications.. big corp like Microsoft or Google, as well as well build web-based foss projects (unless you want to play around with gRPC or GraphQL)
Great API Design tips! Thank you!
Thanks so much. I like your teaching style.
Thank you! 😃
I don’t think there is a problem with auto-increment IDs when building event-driven systems. In a transactional environment, like building a financial solution, you always have unique identifiers such as a reference, for which you already have a generation strategy. IDs, for me, are just for arranging records in the DB; I am less concerned about that, and it has nothing to do with the strategy you want to use to store your data.
Where can I find online jobs to build API endpoints using Django Rest Framework?
Didn't quite understand what you meant by "URI is completely opaque to the client". Can someone explain? Thanks!
The frontend doesn’t process nor construct the URI for the action, it blindly passes it through from the API. Opaque is in relation to the functional structure (eg base, object identifier, action name) that the frontend will then not be aware of as it just sees it as one long string.
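As a rough client-side sketch of what "opaque" means here: the client looks an action up by name and passes the URI through unchanged. The response shape (`actions`, `name`, `method`, `uri`) and the function names are assumptions for illustration, not the exact format from the video.

```python
from typing import Optional, Tuple


def find_action(resource: dict, name: str) -> Optional[dict]:
    """Look up an action by name; None means the server did not offer it."""
    for action in resource.get("actions", []):
        if action["name"] == name:
            return action
    return None


def build_request(resource: dict, name: str) -> Optional[Tuple[str, str]]:
    """Return (method, uri) for the named action, or None if unavailable."""
    action = find_action(resource, name)
    if action is None:
        return None
    # The URI is used exactly as received: no splitting, no string building.
    return action["method"], action["uri"]


# Hypothetical response body, as the client would receive it.
response = {
    "order": {"id": "abc", "status": "pending"},
    "actions": [{"name": "cancel", "method": "POST", "uri": "/orders/abc/cancel"}],
}
cancel_request = build_request(response, "cancel")
refund_request = build_request(response, "refund")
```

A UI built this way shows the cancel button only when `build_request` returns something, and never needs to know how the path is put together.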
I do have some issues with the idea of adding statuses to the list of orders. It feels like it breaks the purpose of the endpoint and inflates the payload.
I see the benefit of it for rendering the filter without the need to query a different endpoint (GET /order/status).
But that data should be queried only once and then cached, rather than fetched every single time. It's not so dynamic that we would need it updated on every request.
I would also add that not exposing the DB ID is a security question too (and it hides your order volume from your competitors).
It's just one extra piece of info about your DB that should not be public.
(So yeah, use a generated ID.)
Anyway, this is good advice, but I would advise against clumping various lists of properties into one call, especially if they are static.
Passing states and actions in the API response seems nonstandard. What if you want to refactor code? Won't that incur a significant time loss?
Thanks as always for the great content Derek!
Regarding meaningful identifiers, this is an interesting tip. My only concern is to be sure anyone in the organization treats it as identifiers and no more, in that case works like a charm.
If, for some reason, someone in some service adds logic that splits the ID to pick up "CA" and "ON", then evolving the ID to a different structure will cause unpredictable problems in those services. Have you ever experienced something like this?
Just came across the ID issue… would love to understand different strategies for generating order IDs in a distributed database
Good suggestion. Possibly in a more detailed video
+1
This is excellent, as always!
The actions are added to the model of the order? I'm confused 😢
The response of the request contains the affordances (actions) that can be taken based on the current state.
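On the server side, that reply can be sketched as deriving the action list from the current state and wrapping the order with it, rather than adding actions to the order model itself. The function names, the `actions` shape, and the URIs are hypothetical; "pending" and "cancel" follow the example used in this thread.

```python
from typing import List


def available_actions(order: dict) -> List[dict]:
    """Derive the affordances from the order's current state."""
    order_uri = f"/orders/{order['id']}"
    actions = [{"name": "self", "method": "GET", "uri": order_uri}]
    if order["status"] == "pending":
        # Only offered while the order can still be cancelled.
        actions.append({"name": "cancel", "method": "POST", "uri": f"{order_uri}/cancel"})
    return actions


def to_response(order: dict) -> dict:
    """Wrap the order in an object, leaving room for actions and later extensions."""
    return {"order": order, "actions": available_actions(order)}


pending = to_response({"id": "abc", "status": "pending"})
shipped = to_response({"id": "abc", "status": "shipped"})
```

The stored order is untouched; the actions exist only in the response representation.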
Thanks Derek for the tips ❤
I don't like the actions part, given your examples with the "timed" cancel being available, etc. And it's a huge security risk to generate URLs for actions, as you are actually injecting paths and methods for your API clients to call. All sorts of validation needs to be done before trusting the values received (DNS/hostname, etc.), or you risk leaking your API tokens too.
From another point of view, how coupled/inefficient would be backend to generate that list of actions available (if domain is complex, that could be 10x more SQL queries or etc)
I think it’s clever and yes under the right conditions it could be helpful. Sending actions to the client does not mean compromise security. Plus you can send a list of available actions which the client can “try” to call but the BE would still verify and authorize.
🎉 Thanks for providing your insights! Really liked tips 3 and 5!
the message format returned in the cancel order example really reminds me of hal+json. It's a nice iteration from that cuz HAL gets a little crazy honestly
Actions as a part of entities is a good one, I'll probably take it into account!
But as for states of some props, I'd rather just go for typeset that I could use on front-end in form of Typescript.
Is there a name to this pattern where you return the "actions" list with what you can do with the order? I really like this idea and want to read more about it
Hypermedia as the engine of application state (HATEOAS)
I don't know if these tips are in the context of REST APIs (I would think so, because of the examples provided). But using verbs instead of nouns for your endpoints is not very REST. Just something I caught in the video.
I mentioned in the video that I wasn't talking about REST, because most people's general interpretation of REST is an HTTP CRUD API with JSON.
@@CodeOpinion oh I missed that. My bad!
Ever thought of getting a lavalier mic, Derek? I feel like you're trying to raise your voice unnaturally to get good audio levels when you're in a wide shot without your mic in frame.
I have thought about it, but I don't think I'm trying to project my voice louder. Still might be a good idea to get one.
Although passing actions in the response makes the system very extensible, is there a concern of making the API more fragile, especially for a large scale distributed system?
If you’re updating business logic, is it better to make this update explicit with a breaking semver change, and blue/green transition the clients?
I was hoping to hear something about transactions and rolling back in case of network failures, etc.
I have many other videos that talk about failures and handling them in various scenarios.
I live in Ontario, Canada too! :)
UUIDs also cause trouble for databases, as the index used to find records becomes really inefficient, unless you use UUIDv1 or ULID, but then you expose the timestamp of when a record was created. You could make POST a bit safer against double adds caused by timeouts if the client generates the UUID and the server returns 409 when they try to use the same UUID again.
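The client-generated UUID idea can be sketched as follows. `OrderStore` is a hypothetical in-memory stand-in for the real persistence layer, and returning 400 for a malformed ID is my assumption; the 409 on a repeated UUID is the behaviour described in the comment.

```python
import uuid


class OrderStore:
    """In-memory stand-in for a database, keyed by the client-supplied UUID."""

    def __init__(self) -> None:
        self._orders = {}

    def create(self, order_id: str, payload: dict) -> int:
        """Return the HTTP status code a POST with this ID would produce."""
        try:
            key = str(uuid.UUID(order_id))  # normalise and validate the ID
        except ValueError:
            return 400  # not a UUID at all
        if key in self._orders:
            return 409  # same ID seen before: likely a retry after a timeout
        self._orders[key] = payload
        return 201


store = OrderStore()
client_id = str(uuid.uuid4())  # generated by the client before sending
first_attempt = store.create(client_id, {"total": 198.00})
retry_after_timeout = store.create(client_id, {"total": 198.00})
```

A client that times out can retry with the same ID and treat the 409 as confirmation that the first attempt went through.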
Auto-increment integers are possible in distributed systems if you have each server increment not by 1, but by the (maximum) number of database servers, with each server starting from a different offset.
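The arithmetic behind that, as a small sketch: each server starts at its own offset and steps by the server count, so the sequences never overlap. The `id_sequence` function is hypothetical; MySQL, for one, exposes the same idea through its `auto_increment_increment` and `auto_increment_offset` settings.

```python
import itertools
from typing import Iterator


def id_sequence(server_index: int, server_count: int) -> Iterator[int]:
    """Yield IDs for one server: start at its offset, step by the server count."""
    if not 0 <= server_index < server_count:
        raise ValueError("server_index must be in range(server_count)")
    return itertools.count(start=server_index + 1, step=server_count)


# Three servers, each handing out IDs independently.
server_a = list(itertools.islice(id_sequence(0, 3), 4))
server_b = list(itertools.islice(id_sequence(1, 3), 4))
server_c = list(itertools.islice(id_sequence(2, 3), 4))
```

The step has to be the maximum number of servers you will ever run, since changing it later means renegotiating the offsets.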
Interesting approach!
If you're generating the IDs at the API level, how do you know that you're not generating a duplicate ID?
That can be guaranteed by UUID
You have my follow good man
What's your take on string vs numbers?
I'm leaning towards using numbers only for things that are explicitly intended to be done math with. Often there are e.g. IDs that happen to be numbers but which are for all intents and purposes just character sequences with no mathematical meaning other than perhaps being sortable in the order in which they were created.
I think for that one can make a good case to use strings then also for data that currently happens to be numbers, because that autoincrementing integer might at some point in the future become a UUID or some other human-interpretable id.
I'm not him, only a noob dev here, but I think it should be strictly dependent on whether that column will actually be used mathematically. IDs most likely aren't, so I personally would rather have UUIDs than auto-increment ints; phone numbers are also not used mathematically for example, so just keep them as strings, especially if you think about things like leading zeros, different formats/symbols/etc which might be present in an otherwise generally-thought-of-as-number field
Just remember that indexing is in a sense a mathematical operation
@@pavelyeremenko4640 True, but AFAIK one that only requires sortability. Indexing on string values is common.
I'm thinking more of mathematical operations like addition and multiplication. Addition on numbers that represent e.g. sums of money yields meaningful results. Adding one technical row/document id to another does not.
@@jenswurm I more so meant the cost of the operation
@@pavelyeremenko4640 It'd probably be one datatype conversion into the internal indexing schema.
putting meaning into unique ID adds complexity to the system, putting the responsibility for this UID before the persistence also adds a LOT of complexity and the payout of speed is probably mostly negligible - one should REALLY make a lot of PoC and non-functional testing before jumping on that train and ONLY if it's reeeeeally needed. Meaning as a prefix is slow and has a cost, meaning as part of the UID generation calls for a service to create that based on some business logic, so it need rules etc... Trusting some part of the system to create UID and pass it down means creating a come-back mechanism when this calls an error and reading messages and reconstructing events...
Another very cool video. Generating the ID at the controller/API level is an interesting solution, but I'm going to have to agree with one of the points you made later on: context is everything.
Out of curiosity, what do you think of Mike Amundsen's "application/vnd.collection+json" media type for an API? Seems like he's developed semantics for a *very* general API whereas you're proposing custom (but consistent semantics) for any given API. I've seen some harsh criticisms of the collection+json (and even agree with them, to a degree). Do you have any criticisms? That would also make a cool video.
Thanks for making these :)
Great content 👍🏼
Glad you enjoyed it
HATEOAS over JSON isn't going to happen, you still need developer documentation so listing the URI & method isn't really useful and is actually harder to work with as it isn't a static string, including an OpenAPI operation name as a "rel" tag would be more helpful.
It also doesn't convey WHY an action is or isn't available. If I had one complaint about current "REST" tooling, it's this: if you want to return an error on cancelling an order such that a client can interpret one of n errors, e.g. the order is shipped or it is past the 15-minute time limit, current tooling sucks on both the client and server side.
Of course. The point is you're combining runtime and design time. Not strictly one or the other. You can convey why in your responses. Include the action with no uri but rather a disabled property... Whatever floats your boat. You still need docs on payloads. I do make the operation ID and the action name or link relation the same for reference. Add info at runtime and be explicit to the client rather than all knowledge being decided up front at design time of the client.
Oh, don’t get me wrong, I'm 100% for returning that information in the response for runtime use. Probably more venting that there's somehow zero support in the ecosystem for anything like this.
I don't understand why in the "order" object you have a property "orderId", it is needlessly redundant. This is much better:
{
  "order": {
    "id": "...",
    "total": 198.00,
    "customer": {
      "id": "...",
      "name": "..."
    }
  }
}
Derek, I could not disagree more strongly with tip #2. I have worked many years to get our company’s developers to stop this practice. Here is why: while it might be useful for you to understand the ID, inevitably some developer will parse that ID and use the meaningfulness in their code (even though that same info may be available in the payload). Once this happens, it's downhill fast from there.
If you change the structure, their code breaks. If you segment like your example (CA-ON-12345), then when you get to the 99999 item, your code breaks. It is just a bad practice and I am really surprised that you offered this as a positive.
I love your content and send people to your channel often, but I couldn’t let this one slip by without commenting.
Thanks for the comment. Having identifiers that are meaningful for REPRESENTATION to the user and developers is helpful. That doesn't mean it has to be what you're using as the primary key in a database. I wish I would have said this in the video to make it clear, as I've got a few comments about it. A great illustration of this is something like an invoice. Your invoice number can often be prefixed or given some meaningful structure; your underlying database stores it but doesn't use it as a primary or foreign key, yet everywhere for display purposes you're using that invoice number, and even for searching or reference in other places. As to "developers will parse it"... well, trying to parse any string that could change is going to be a disaster if it's done carelessly (whole different topic). I don't often use the term "identifier" or "ID" to represent a primary key exclusively, which I think is probably not normal, and that's where the disconnect is. I often have multiple "identifiers" for things, and often they are each unique.
@@CodeOpinion Thanks for the reply. I typically guide people to distinguish between a “business identifier” and a “system identifier”. I will often recommend using the business identifier in URIs (assuming it isn’t PII) for the same reasons you give. Typically, they are well known and printed on some business form or shown on a screen. They can be useful for “priming” the pump when making that first call.
I also advocate that each entity within the API have a system identifier that is system-generated and meaningless. Most times, an aggregate will have a business identifier but its subordinate entities will not, so having a system identifier always available is helpful when building other URIs. I have even had people support both the system and business identifier for an aggregate just to provide flexibility.
As for bad developer practices, I know you understand them well given your channel! I have found that my guidance often has to steer around common bad practices since there are always new developers coming in who haven’t learned the hard way.
It shouldn't be difficult to prevent programmers from relying on knowledge about the structure, like the 'CA-ON...': during testing, hand out some unknown, random identifiers.
Code expecting and depending on 'CA-ON' will barf.
Tip 1 was rather useless, because it involves a complete architectural change, from synchronous monolith to async event-driven. I'm also not a big fan of tip 4; I think it is a mistake. Add a separate endpoint to return information about the API itself.
/orders/ -> list of orders
/orders/{id} -> one order
/orders/describe -> info about the api, like the available statuses
I would caution generating meaningful identifiers. You don't want to expose private information via your primary key that's exposed in your client.
Instant subscribe.
Quite interesting 🤗😀
Great tips thank you
Wouldn't meaningful IDs make your table non 1NF due to non atomicity even if it's just informative?
Super useful
As a software engineer, I don't really buy any of these man
Develop for Juniors... Seniors will love you.
Meaningful IDs make hacking attempts much easier. Use other fields that aren't used as IDs.
Auto incrementing IDs are less hackable? As I've mentioned in other comments, this doesn't need to represent a database key. It can be composed purely at runtime for display.
I'd like to add: don't conflate business logic with HTTP status codes. When I see a 404 Not Found, it should mean I screwed up royally and sent the wrong URL entirely. It should not mean "teehee, order not found".
I like the idea of keeping the URI list on the backend side. As I see it, we can keep all the URIs the client needs on the server and preload them once the app has started. I also see a crazy idea: you can send all the logic to the client side from the server xD.
Another one is using numbers for statuses rather than words
Passing actions in the API response is not a good way to do it, in my opinion. It adds a lot of complexity to the project. It would also take time, and testing is much harder and more complex.
For example, say you can cancel the order within 15 min. The user knows the cancel URL. If a request is made to that endpoint, you have to check and validate the 15 min again anyway.
And it goes on like that. I don't like the approach.
And number #6: "When in doubt - leave it out"
I'm usually a hater when it comes to your DDD content, but this is brilliant, keep it up.
Why a hater on that content, curious? Also a lot of what I'm describing in the video is somewhat correlated.
@@CodeOpinion I don't like DDD, or actually I don't like object-centric designs. I guess the concept is inspired by clean architecture, or other concepts that are similar in a way. For me it's just an integration point, well designed, that is robust, extendable, and easy to interpret.
@@CodeOpinion FWIW, your videos have helped me solidify my understanding and appreciation for DDD. That includes understanding when it's not needed too.
Tips 1: get database data from FE
Good mind bending
Great tips. I don't know if anyone has said this to you, but I definitely think if you had a superpower, it would be turning into a werewolf.
That's why I love HATEOAS!